Just can't get it to work
It loads all the models and then the console shows: Error: Quota exceeded.
You are running out of GPU memory on either your browser or device. Please point your browser to webgpureport.org and look for the following two entries under limits:
- maxBufferSize: 4294967296 (4 GiB)
- maxStorageBufferBindingSize: 4294967292 (just under 4 GiB)
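
If you'd rather check from the DevTools console than from webgpureport.org, here is a rough sketch. It assumes the two values listed above are what the demo expects, which this thread doesn't actually confirm. `REQUIRED_LIMITS`, `findLowLimits` and `reportWebGpuLimits` are made-up names for illustration and are not part of the app.

```typescript
// Values quoted in the reply above, in bytes. Treating them as required
// minimums is an assumption, not something the app documents.
const REQUIRED_LIMITS = {
  maxBufferSize: 4294967296,
  maxStorageBufferBindingSize: 4294967292,
} as const;

type LimitName = keyof typeof REQUIRED_LIMITS;

// Pure helper: returns the names of the limits that fall below the values above.
function findLowLimits(limits: Record<LimitName, number>): LimitName[] {
  return (Object.keys(REQUIRED_LIMITS) as LimitName[]).filter(
    (name) => limits[name] < REQUIRED_LIMITS[name],
  );
}

// Browser-only: asks WebGPU for an adapter and logs the two limits.
async function reportWebGpuLimits(): Promise<void> {
  // Cast to any so the sketch compiles without the @webgpu/types package.
  const gpu = (globalThis as any).navigator?.gpu;
  if (!gpu) {
    console.log("WebGPU is not available in this browser.");
    return;
  }
  const adapter = await gpu.requestAdapter();
  if (!adapter) {
    console.log("No WebGPU adapter was returned.");
    return;
  }
  for (const name of Object.keys(REQUIRED_LIMITS) as LimitName[]) {
    console.log(`${name}: ${adapter.limits[name]}`);
  }
  const low = findLowLimits(adapter.limits);
  console.log(low.length === 0 ? "Limits look fine." : `Below expected: ${low.join(", ")}`);
}
```

Calling `reportWebGpuLimits()` in the browser prints both limits and flags any that are lower than the values above. These are the limits the adapter reports as supported; whether the app actually requests them when it creates its device is a separate question.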
Thanks for getting back. I can run https://huggingface.co/spaces/webml-community/Nemotron-3-Nano-WebGPU and https://huggingface.co/spaces/webml-community/granite-4.0-1b-speech-webgpu.
So I'm not sure what the problem is. Could you check the app to make sure it still works?
Hi @huggingworld. We checked that the demo runs correctly on our side. We also switched to q4f16 quantization for the audio encoder, which saves 200 MB. Please check again from a computer/browser with limits similar to those mentioned above.
Was using the wrong browser!
Works fine, q4f16 and q4, in Chrome 146
Errors in Chrome Canary 148.
Not related to GPU memory.
Really cool app!