Are there any more metrics or articles on how this compares to other models for streaming?

#13
by minimanatee - opened

The linked articles on real-world use cases make this seem very promising. The 24 ms on an RTX 5090 is also very impressive, but that seems to cover only end-of-turn transcript finalization. Are there any other data points on how this compares to the other Parakeet model in terms of time to partial, specifically for streaming?

I am wondering the same thing. The arXiv paper does not report true time to partial, only the minimum possible average latency that a partial could have. In my experiments, the average time to partial is at least 200 ms even in the most aggressive configuration.
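To make numbers comparable across models, it may help to pin down what "time to partial" means. Below is a minimal sketch under one possible definition (an assumption, not taken from the paper or the model card): for each word, the delay between the moment its audio ended in the stream and the moment the first partial containing it was emitted. The function name `time_to_partial_ms` and the timestamps are hypothetical and purely illustrative; they are not measurements from this model.

```python
from statistics import mean


def time_to_partial_ms(word_end_ms, partials):
    """Per-word time to partial, in milliseconds.

    word_end_ms: time each word's audio ended in the stream (ms), in word order.
    partials:    (emit_time_ms, num_words_in_partial) pairs, in emission order.

    Assumed definition: latency of word i = emit time of the first partial
    that contains at least i + 1 words, minus the word's end time.
    Words never covered by any partial are skipped.
    """
    latencies = []
    for idx, end_ms in enumerate(word_end_ms):
        # Find the first partial whose word count covers this word.
        emit_ms = next((t for t, n in partials if n > idx), None)
        if emit_ms is not None:
            latencies.append(emit_ms - end_ms)
    return latencies


# Hypothetical timestamps, for illustration only.
word_end_ms = [400, 900, 1500]
partials = [(650, 1), (1120, 2), (1700, 3)]

latencies = time_to_partial_ms(word_end_ms, partials)
print(latencies)        # per-word latencies in ms
print(mean(latencies))  # average time to partial in ms
```

Measured this way, the result includes chunking delay, inference time, and any buffering in the client, which is why it can sit well above a theoretical minimum latency.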
