Just can't get it to work

#1
by huggingworld - opened

It loads all the models, and then the console shows: Error: Quota exceeded.

IBM Granite org

You are running out of GPU memory on either your browser or device. Please point your browser to webgpureport.org and look for the following two entries under limits:

  • maxBufferSize: 4294967296 (4 GiB)
  • maxStorageBufferBindingSize: 4294967292 (just under 4 GiB)
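
For anyone who prefers to check from the browser console rather than the report page, here is a minimal sketch that compares an adapter's limits against the two values listed above. The helper names (`REQUIRED_LIMITS`, `findShortfalls`, `reportAdapterLimits`) are made up for illustration, and treating the quoted values as the demo's actual minimums is an assumption; only `navigator.gpu.requestAdapter()` and `adapter.limits` come from the standard WebGPU API.

```typescript
// The two limits named above, with the values quoted in this thread.
// Assumption: these are used here as thresholds; the app's real
// minimum requirements may differ.
const REQUIRED_LIMITS = {
  maxBufferSize: 4294967296,
  maxStorageBufferBindingSize: 4294967292,
} as const;

type LimitName = keyof typeof REQUIRED_LIMITS;

// Hypothetical helper: returns the names of the limits that fall below
// the quoted values. A missing entry is treated as 0.
function findShortfalls(
  limits: Partial<Record<LimitName, number>>,
): LimitName[] {
  return (Object.keys(REQUIRED_LIMITS) as LimitName[]).filter(
    (name) => (limits[name] ?? 0) < REQUIRED_LIMITS[name],
  );
}

// Browser-only: reads the adapter limits through the WebGPU API.
// `globalThis as any` avoids needing the @webgpu/types package.
async function reportAdapterLimits(): Promise<void> {
  const gpu = (globalThis as any).navigator?.gpu;
  if (!gpu) {
    console.log("WebGPU is not available in this environment.");
    return;
  }
  const adapter = await gpu.requestAdapter();
  if (!adapter) {
    console.log("No WebGPU adapter was returned.");
    return;
  }
  const shortfalls = findShortfalls(adapter.limits);
  console.log(
    shortfalls.length === 0
      ? "Both limits meet the quoted values."
      : `Below the quoted values: ${shortfalls.join(", ")}`,
  );
}
```

Calling `reportAdapterLimits()` in a WebGPU-capable browser logs which of the two limits, if any, come up short. Passing this check does not guarantee the demo will run, since other factors (such as the browser build) can also cause failures.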

Thanks for getting back to me. I can run https://huggingface.co/spaces/webml-community/Nemotron-3-Nano-WebGPU and https://huggingface.co/spaces/webml-community/granite-4.0-1b-speech-webgpu.

So I'm not sure what the problem is. Could you check the app to make sure it still works?

IBM Granite org

Hi @huggingworld. We checked that the demo runs correctly on our side. We also switched to q4f16 quantization for the audio encoder, which saves 200 MB. Please check again from a computer/browser with limits similar to those mentioned above.

huggingworld changed discussion status to closed

Was using the wrong browser!
Works fine with both q4f16 and q4 in Chrome 146.
Errors occur in Chrome Canary 148.
Not related to GPU memory.
Really cool app! 😃

huggingworld changed discussion status to open
huggingworld changed discussion status to closed
