Multilingual version planned?
Thanks for sharing this great model! I've been looking for a fast, true streaming model for a long time. Is a multilingual version planned, especially for German and other European languages? What's the rough timeframe? :) Thanks a lot!
Hi @fosple , yes, we’re planning a multilingual version that will cover German and European languages. Will share more updates over the coming months!
For English, is this any better than Parakeet V0.2/0.3? (Not streaming scenario; need highest-quality transcripts)
Hey @kunaldhawan thanks for the model and the cache-aware streaming. The technical blog was super informative.
Just wondering if there are any plans to add support for Indian Languages, especially Hindi/Hinglish? Thanks!
Hey @pushkar-nurix, you can try this model out in the meantime. It's not Hinglish, but it supports Hindi very accurately for now, and it's based on the NeMo architecture. We are working on it and plan to release a Hinglish model and cache-aware models as well in a few weeks.
https://huggingface.co/spaces/RinggAI/STT
@kunaldhawan is there any chance we could collaborate on releasing a subset of our Indic ASR models together sometime in the future?
@kunaldhawan Hi Kunal, regarding your post on Indic ASR: I'm currently working on smart-tuning techniques. I think this could be really useful for the Hinglish models you plan to release. I'd love to collaborate on this.
@kunaldhawan Hi, any chance you could share training recipes and explain how to build a tokenizer for a new language?
I am planning to experiment with Arabic datasets.
Hi @gnomefin , you can refer to the following NeMo scripts and configs to train or fine-tune the model and to build a tokenizer for a new language:
- Training script: https://github.com/NVIDIA-NeMo/NeMo/blob/main/examples/asr/asr_transducer/speech_to_text_rnnt_bpe.py
- Cache-aware streaming config: https://github.com/NVIDIA-NeMo/NeMo/blob/main/examples/asr/conf/fastconformer/cache_aware_streaming/fastconformer_ctc_bpe_streaming.yaml
- Tokenizer training on new data: https://github.com/NVIDIA-NeMo/NeMo/blob/main/scripts/tokenizers/process_asr_text_tokenizer.py
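As a rough illustration of the tokenizer step above, here is a minimal sketch that writes a NeMo-style JSON-lines manifest and assembles a command line for `process_asr_text_tokenizer.py`. The sample paths, the Arabic transcripts, the output directory, and the vocabulary size of 1024 are all made-up placeholders, and the flags shown (`--manifest`, `--data_root`, `--vocab_size`, `--tokenizer`, `--spe_type`, `--log`) are the ones I believe the script accepts; please confirm them with the script's `--help` before relying on this.

```python
import json
import shlex
from pathlib import Path


def write_manifest(samples, manifest_path):
    """Write one JSON object per line with audio_filepath, duration (seconds), text."""
    manifest_path = Path(manifest_path)
    with manifest_path.open("w", encoding="utf-8") as f:
        for audio_filepath, duration, text in samples:
            entry = {
                "audio_filepath": audio_filepath,
                "duration": duration,
                "text": text,
            }
            # ensure_ascii=False keeps non-Latin scripts (e.g. Arabic) readable.
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return manifest_path


def tokenizer_command(manifest_path, output_dir, vocab_size=1024):
    """Build the argument list for the tokenizer script (flags are assumptions)."""
    return [
        "python",
        "scripts/tokenizers/process_asr_text_tokenizer.py",
        f"--manifest={manifest_path}",
        f"--data_root={output_dir}",
        f"--vocab_size={vocab_size}",  # illustrative value, not the model's actual size
        "--tokenizer=spe",             # SentencePiece tokenizer
        "--spe_type=bpe",              # BPE variant, matching the *_bpe training script
        "--log",
    ]


if __name__ == "__main__":
    # Hypothetical samples: (path to audio, duration in seconds, transcript).
    samples = [
        ("audio/utt_0001.wav", 3.2, "مرحبا بالعالم"),
        ("audio/utt_0002.wav", 4.7, "كيف حالك اليوم"),
    ]
    manifest = write_manifest(samples, "train_manifest.json")
    # Print the command to run from the root of a NeMo checkout.
    print(shlex.join(tokenizer_command(manifest, "tokenizers/ar")))
```

The printed command is meant to be run from the root of a NeMo checkout. The resulting tokenizer directory would then be referenced from the training config used with the training script linked above.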
That sounds great, @harsh2ai and @vrushildavra . Please feel free to open an issue or PR in the NeMo repository and cc me. I’d be happy to collaborate.
Thanks for the feedback, @pushkar-nurix . Support for Indian languages is definitely something we’re interested in, and we’ll look into adding this to the roadmap for upcoming releases.
Hi, how much data is required to fine-tune on a new language if I start from the English pretrained weights?
Hey @kunaldhawan, I have fine-tuned a new model on Hindi and would like to contribute it to the NVIDIA repo you previously mentioned.
You can check it out here: https://huggingface.co/SkunkWorkLabs/varuna-stt (it's built on the same base architecture as streaming-en-0.6b). I am also raising a PR on GitHub; for reference, see https://github.com/NVIDIA-NeMo/NeMo/issues/15664
Looking forward to contributing.
WIP https://huggingface.co/nvidia/NVIDIA-Nemotron-3.5-ASR-Streaming-Multilingual-0.6b
Hello, could you list the languages this upcoming model will support?