A fine-tuned version of Qwen3-8B for news media bias detection and neutral rewriting, developed by the Vector Institute as part of the UnBias-Plus project.
Given a news article, the model identifies biased language segments, classifies their bias type and severity, provides neutral replacements, and returns a fully rewritten unbiased version of the article — all in a single structured JSON response.
| Property | Value |
|---|---|
| Base model | Qwen/Qwen3-8B |
| Fine-tuning method | Supervised Fine-Tuning (SFT) with LoRA |
| Training precision | bf16 (no quantization during training) |
| LoRA rank | 16 |
| Training framework | Unsloth + TRL |
| Context length | 8192 tokens |
| Output format | Structured JSON |
Example usage with Unsloth:

```python
from unsloth import FastLanguageModel
import torch, json

model, tokenizer = FastLanguageModel.from_pretrained(
    "vector-institute/Qwen3-8B-UnBias-Plus-SFT",
    max_seq_length=8192,
    load_in_4bit=False,  # set True for ~5GB VRAM (laptop)
    dtype=torch.bfloat16,
)
FastLanguageModel.for_inference(model)

SYSTEM_PROMPT = """You are an expert linguist and bias detection specialist.
Your task is to carefully read a news article, detect ALL biased language,
and return a structured JSON response. Return ONLY valid JSON, no extra text."""

article = "Your news article here..."

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": f"Analyze the following article for bias and return the result in the required JSON format.\n\nARTICLE:\n{article}"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=True,
    return_tensors="pt",
    return_dict=True,
)

outputs = model.generate(
    input_ids=inputs["input_ids"].to("cuda"),
    attention_mask=inputs["attention_mask"].to("cuda"),
    max_new_tokens=4096,
    temperature=0.1,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)

# Extract JSON: strip the thinking block if present
if "</think>" in response:
    response = response.split("</think>", 1)[-1].strip()
result = json.loads(response)
```
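The bare `json.loads` above assumes the model emits clean JSON. If you want to be defensive against occasional stray text around the object, a best-effort extraction helper (a sketch, not part of the released code) might look like this:

```python
import json


def extract_json(response: str) -> dict:
    """Best-effort extraction of the first JSON object in a model response.

    Strips a Qwen3 <think>...</think> block if present, then falls back to
    the span from the first '{' to the last '}'. Raises ValueError if
    nothing parses.
    """
    if "</think>" in response:
        response = response.split("</think>", 1)[-1]
    response = response.strip()
    # Fast path: the whole response is valid JSON.
    try:
        return json.loads(response)
    except json.JSONDecodeError:
        pass
    # Fallback: take the outermost brace-delimited span.
    start, end = response.find("{"), response.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(response[start:end + 1])
        except json.JSONDecodeError:
            pass
    raise ValueError("No valid JSON object found in model response")
```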
The model returns a single JSON object with the following schema:

```json
{
  "binary_label": "biased" | "unbiased",
  "severity": 0 | 2 | 3 | 4,
  "bias_found": true | false,
  "biased_segments": [
    {
      "original": "exact substring from input article",
      "replacement": "neutral alternative phrase",
      "severity": "high" | "medium" | "low",
      "bias_type": "loaded language | dehumanizing framing | false generalizations | framing bias | euphemism/dysphemism | politically charged terminology | sensationalism",
      "reasoning": "1-2 sentence explanation"
    }
  ],
  "unbiased_text": "Full rewritten neutral article"
}
```
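To illustrate how the schema might be consumed, here is a hypothetical helper that rewrites an article by swapping each biased segment for its neutral replacement, relying on the schema's exact-substring guarantee (the sample `result` dict is hand-written for the example, not real model output):

```python
def apply_replacements(article: str, result: dict) -> str:
    """Swap each biased segment for its neutral replacement.

    Relies on `original` being an exact substring of the input article,
    as the output schema guarantees.
    """
    text = article
    for seg in result.get("biased_segments", []):
        text = text.replace(seg["original"], seg["replacement"])
    return text


# Hand-written example result (illustrative only):
result = {
    "bias_found": True,
    "biased_segments": [
        {"original": "radical scheme", "replacement": "proposal",
         "severity": "medium", "bias_type": "loaded language",
         "reasoning": "Loaded phrasing."}
    ],
}
article = "Critics slammed the radical scheme on Tuesday."
print(apply_replacements(article, result))
# → Critics slammed the proposal on Tuesday.
```

Note that for production use you would likely want the model's own `unbiased_text` field instead, since it is a full rewrite rather than a patchwork of substitutions.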
Article-level `severity` values (distinct from the per-segment high/medium/low severity):

| Value | Meaning |
|---|---|
| 0 | Neutral — no bias detected |
| 2 | Recurring biased framing |
| 3 | Strong persuasive tone |
| 4 | Inflammatory rhetoric |
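For convenience, the scale above can be encoded as a lookup; note that the scale skips 1. This is a sketch with paraphrased labels, not part of the released code:

```python
# Article-level severity codes, per the model card's scale (1 is unused).
SEVERITY_LABELS = {
    0: "neutral (no bias detected)",
    2: "recurring biased framing",
    3: "strong persuasive tone",
    4: "inflammatory rhetoric",
}


def severity_label(code: int) -> str:
    """Map a numeric severity code to its human-readable meaning."""
    try:
        return SEVERITY_LABELS[code]
    except KeyError:
        raise ValueError(f"Unknown severity code: {code}") from None
```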
Fine-tuned on vector-institute/Unbias-plus, a curated dataset of news articles with expert-annotated bias labels, segment-level annotations, and neutral rewrites.
| Setup | Configuration |
|---|---|
| Recommended (server) | `load_in_4bit=False`, `dtype=torch.bfloat16` (~16GB VRAM) |
| Lightweight (laptop) | `load_in_4bit=True` (~5GB VRAM) |
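The table can be turned into a small helper (hypothetical, not part of the released code) that picks loading options for the 8B model from the VRAM you have available:

```python
def loading_kwargs(vram_gb: float) -> dict:
    """Choose FastLanguageModel.from_pretrained options for the 8B model.

    Thresholds follow the memory table above; for the full-precision path
    you would also pass dtype=torch.bfloat16.
    """
    if vram_gb >= 16:
        # Full bf16 weights: recommended server setup.
        return {"load_in_4bit": False}
    if vram_gb >= 5:
        # 4-bit quantized weights: fits a ~5GB laptop GPU.
        return {"load_in_4bit": True}
    raise ValueError("The 8B model needs ~5GB of VRAM; consider the 4B variant")
```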
|  | 4B | 8B |
|---|---|---|
| VRAM (bf16) | ~8GB | ~16GB |
| VRAM (4-bit) | ~3GB | ~5GB |
| Speed | Faster | Slower |
| Quality | Strong | Higher accuracy on complex articles |
| Recommended for | Laptops, real-time APIs | Servers, batch processing |
If you use this model in your research or application, please cite:
```bibtex
@misc{unbias-plus-8b,
  title = {Qwen3-8B-UnBias-Plus-SFT},
  author = {Vector Institute},
  year = {2026},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/vector-institute/Qwen3-8B-UnBias-Plus-SFT}},
  note = {Part of the UnBias-Plus project: https://github.com/VectorInstitute/unbias-plus}
}
```