Thanks. This is by far the best denial stripping I've ever seen.

#30

by phil111 - opened 1 day ago

All other abliterated models I've tested, not just those based on Qwen3.5, performed much worse on my broad knowledge test. Something about the process of removing denials seems to scramble the weights and cause a spike in hallucinations at the horizon of knowledge.

However, this one scored only a couple of points lower than the source model and remained just as coherent and skillful during the test. The small difference doesn't really matter, since Qwen3.5 is broadly ignorant outside of STEM anyway: it only scored 64/100 on my broad knowledge test, while other similarly sized models like Gemma 3 27b and Mistral Small 22b scored 74 or higher. So using Qwen3.5 for broad knowledge doesn't make much sense in the first place.

I won't ask you how you did it, but I'm very impressed. You managed to do something nobody else was able to do and that I previously thought was impossible. Thanks.

whoisjeremylam

1 day ago

My current use cases have no specific need for uncensoring, but after some tests, I'm very impressed.

phil111

1 day ago

@whoisjeremylam Same here. I just test for needless alignment, such as refusing to output emojis representing body parts, use a common cuss word in a story, or attempt a joke at the expense of a public figure. These are all things nearly all humans do, so it makes absolutely no sense for an AI to respond with a denial. I don't ask about things most humans have no interest in, such as creating illegal substances or how to harm others.

So the widespread and aggressive denials Qwen3.5 gives are not remotely reasonable. It will even refuse to respond to minor things if you have a system prompt that subtly tells it to stop acting like the thought police, and once you remove that system prompt it will respond. Plus, this second-guessing of the system prompt is apparently degrading performance, because it performs less reliably with any system prompt.
