Multilingual version planned?

by fosple - opened Jan 5

Jan 5

Thanks for sharing this great model! I've been looking for a fast, true streaming model for a long time. Is a multilingual version planned, especially for German and other European languages? What's the rough timeframe? :) Thanks a lot!

kunaldhawan

NVIDIA org Jan 6

edited Jan 6

Hi @fosple , yes, we’re planning a multilingual version that will cover German and European languages. Will share more updates over the coming months!

yukiarimo

Jan 6

For English, is this any better than Parakeet V0.2/0.3? (Not for a streaming scenario; I need the highest-quality transcripts.)

pushkar-nurix

Jan 7

edited Jan 7

Hey @kunaldhawan thanks for the model and the cache-aware streaming. The technical blog was super informative.

Just wondering if there are any plans to add support for Indian Languages, especially Hindi/Hinglish? Thanks!

harsh2ai

Jan 7

Hey @pushkar-nurix, you can try this model out in the meantime. It's not Hinglish, but it supports Hindi very accurately for now, and it's based on the NeMo architecture. We are working on, and plan to release, a Hinglish model and cache-aware models in a few weeks.

https://huggingface.co/spaces/RinggAI/STT

@kunaldhawan is there any chance we could collaborate on releasing a subset of our Indic ASR models together sometime in the future?

vrushildavra

Jan 7

@kunaldhawan Hi Kunal, regarding your post on Indic ASR: I'm currently working on smart-tuning techniques. I think this could be really useful for the Hinglish models you plan to release. I'd love to collaborate on this.

gnomefin

Jan 9

@kunaldhawan Hi, any chance you could share training recipes and explain how to build a tokenizer for a new language?
I am planning to experiment with Arabic datasets.

kunaldhawan

NVIDIA org Jan 12

> @kunaldhawan Hi, any chance you could share training recipes and explain how to build a tokenizer for a new language?
> I am planning to experiment with Arabic datasets.

Hi @gnomefin , you can refer to the following NeMo scripts and configs to train or fine-tune the model and to build a tokenizer for a new language:

Training script: https://github.com/NVIDIA-NeMo/NeMo/blob/main/examples/asr/asr_transducer/speech_to_text_rnnt_bpe.py
Cache-aware streaming config: https://github.com/NVIDIA-NeMo/NeMo/blob/main/examples/asr/conf/fastconformer/cache_aware_streaming/fastconformer_ctc_bpe_streaming.yaml
Tokenizer training on new data: https://github.com/NVIDIA-NeMo/NeMo/blob/main/scripts/tokenizers/process_asr_text_tokenizer.py
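
As a rough sketch of how the tokenizer step might be prepared for a new language such as Arabic: NeMo ASR manifests are JSON-lines files whose entries carry `audio_filepath`, `duration`, and `text` fields, and the tokenizer script above can read its training text from such a manifest. The Python below writes a tiny manifest and assembles (without running) a command line for `process_asr_text_tokenizer.py`. The helper names, file paths, sample sentence, and vocabulary size are hypothetical, and the flags reflect the script's documented usage as best understood here, so please check its `--help` output before relying on them.

```python
import json
import shlex


def write_manifest(entries, manifest_path):
    """Write one JSON object per line with audio_filepath, duration (seconds), and text."""
    with open(manifest_path, "w", encoding="utf-8") as f:
        for audio_filepath, duration, text in entries:
            record = {
                "audio_filepath": audio_filepath,
                "duration": duration,
                "text": text,
            }
            # ensure_ascii=False keeps non-Latin scripts (e.g. Arabic) readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def tokenizer_command(manifest_path, out_dir, vocab_size=1024):
    """Build, but do not run, a command line for the NeMo tokenizer script.

    Assumption: the flags below match the script's docstring; verify with --help.
    The vocab_size default is an illustrative value, not a recommendation.
    """
    return [
        "python",
        "scripts/tokenizers/process_asr_text_tokenizer.py",
        f"--manifest={manifest_path}",
        f"--data_root={out_dir}",
        f"--vocab_size={vocab_size}",
        "--tokenizer=spe",
        "--spe_type=bpe",
        "--log",
    ]


if __name__ == "__main__":
    # Hypothetical paths; print the command so it can be inspected before running.
    print(shlex.join(tokenizer_command("train_manifest.json", "tokenizers/ar")))
```

The resulting tokenizer directory would then be passed to the training script through Hydra overrides such as `model.tokenizer.dir`, alongside the manifest paths under `model.train_ds` and `model.validation_ds`.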

> @kunaldhawan is there any chance we could collaborate on releasing a subset of our Indic ASR models together sometime in the future?
> @kunaldhawan Hi Kunal, regarding your post on Indic ASR: I'm currently working on smart-tuning techniques. I think this could be really useful for the Hinglish models you plan to release. I'd love to collaborate on this.

That sounds great, @harsh2ai and @vrushildavra . Please feel free to open an issue or PR in the NeMo repository and cc me. I’d be happy to collaborate.

> Just wondering if there are any plans to add support for Indian Languages, especially Hindi/Hinglish? Thanks!

Thanks for the feedback, @pushkar-nurix . Support for Indian languages is definitely something we’re interested in, and we’ll look into adding this to the roadmap for upcoming releases.

teeofftechnologies

Jan 22

Hi, how much data is required to fine-tune on a new language if I start from the English pretrained weights?

nabil6391

Apr 7

@kunaldhawan , will there be any multilingual model supporting Arabic in the upcoming months?

harsh2ai

16 days ago

Hey @kunaldhawan, I have fine-tuned a new model on Hindi and would like to contribute it to the NVIDIA repo you mentioned previously.

> That sounds great, @harsh2ai and @vrushildavra . Please feel free to open an issue or PR in the NeMo repository and cc me. I’d be happy to collaborate.

You can check it out here: https://huggingface.co/SkunkWorkLabs/varuna-stt (it's built on the same base architecture as streaming-en-0.6b). I am raising a PR on GitHub as well; see https://github.com/NVIDIA-NeMo/NeMo/issues/15664

Looking forward to contributing.

Amargolin

NVIDIA org 4 days ago

WIP https://huggingface.co/nvidia/NVIDIA-Nemotron-3.5-ASR-Streaming-Multilingual-0.6b

slavikz

2 days ago

> WIP https://huggingface.co/nvidia/NVIDIA-Nemotron-3.5-ASR-Streaming-Multilingual-0.6b

Hello, could you list supported languages of this upcoming model?

Amargolin

NVIDIA org about 19 hours ago

@slavikz: The same languages as in tdt-v3 (EU), plus other popular languages. Examples: pt-BR, ar-AR, ja-JP, ko-KR, hi-IN, zh-CN, vi-VN, he-IL, tr-TR, th-TH, and more.
Is there a specific language you are waiting for?
