Ollama fails to load the model

#9
by Misaka19999 - opened

This model runs perfectly in LM Studio but fails in Ollama with the following error:
PS C:\Users\zgsxt> ollama run hf.co/HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive:Q4_K_M --verbose
Error: 500 Internal Server Error: unable to load model: C:\Users\zgsxt\.ollama\models\blobs\sha256-c117a47c5d8d1bb91d68031aaa77891f10118338e1174accc48c55ee3fff8717
The SHA256 check passed, and I have tried both the Q4 and Q8 quants. VRAM is adjustable from 32 GB to 96 GB, but it still doesn't work. Any help or suggestion would be appreciated!

Ollama, as of now (v0.18.1), still does not support third-party GGUFs with the qwen35/qwen35moe architecture (it does, of course, work with the quants downloaded from Ollama's own library). For now, you can create a custom Modelfile for text generation only (no vision), something like this:

# point to downloaded GGUF
FROM ./Qwen3.5-35B-A3B.Q4_K_M.gguf

# use Ollama engine
TEMPLATE {{ .Prompt }}
RENDERER qwen3.5
PARSER qwen3.5

# suggested parameters for the official model; you may tweak them
PARAMETER top_p 0.95
PARAMETER presence_penalty 1.5
PARAMETER temperature 1
PARAMETER top_k 20

LICENSE "                                Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/
"
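Once the Modelfile is saved (the file and model names below are just examples; adjust the paths to wherever your GGUF actually lives), you can register and run the model with the standard Ollama CLI:

```shell
# create a local model named "qwen3.5-uncensored" from the Modelfile
# (assumes the Modelfile and the GGUF are in the current directory)
ollama create qwen3.5-uncensored -f ./Modelfile

# then run it as usual
ollama run qwen3.5-uncensored --verbose
```

`ollama create` copies the GGUF into Ollama's blob store, so you can delete the original file afterwards if disk space is tight.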

Full support for third party GGUFs should be added once a PR that updates llama.cpp is merged.
