Qwen3.5-2B Chat LoRA (Ultrachat SFT)

Overview

  • LoRA fine-tuned model based on Qwen/Qwen3.5-2B
  • Trained for chat-style instruction following
  • Uses conversational formatting (<|user|>, <|assistant|>)
  • Trained on a reduced dataset subset to keep training time short

Important

  • This repository contains LoRA adapter weights only
  • Base model is NOT included
  • Required base model: Qwen/Qwen3.5-2B

Usage

  • Load the base model, then attach the LoRA adapter
  • Build prompts with the chat template so they match the training format

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model = "Qwen/Qwen3.5-2B"
adapter = "Neural-Hacker/Qwen3.5-2B-chat"

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

# Load the base model in half precision and spread it across available devices
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, adapter)
model.eval()

messages = [
    {"role": "user", "content": "Explain gravity briefly"}
]

# Build the prompt with the chat template used during fine-tuning
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        eos_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, not the prompt
generated = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(generated, skip_special_tokens=True)

# Drop any reasoning prefix the model may emit before its final answer
if "</think>" in response:
    response = response.split("</think>")[-1]

print(response.strip())
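
Optionally, the LoRA weights can be folded into the base model so the result behaves like a standalone checkpoint and no longer needs peft at inference time. This is a minimal sketch using PEFT's merge_and_unload on the model loaded above; the output directory name is only an example.

# Merge the adapter into the base weights and save a standalone copy
merged = model.merge_and_unload()
merged.save_pretrained("qwen3.5-2b-chat-merged")
tokenizer.save_pretrained("qwen3.5-2b-chat-merged")

The merged folder can then be loaded with AutoModelForCausalLM.from_pretrained like any regular model.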

Training Details

  • Base Model: Qwen/Qwen3.5-2B
  • Method: LoRA (PEFT)
  • Framework: Transformers + TRL (SFTTrainer); see the training sketch after the hyperparameters below
  • Dataset: HuggingFaceH4/ultrachat_200k (train_sft split)
  • Data Used: 10,000 samples (subset for fast training; loading sketch below)
  • Task: Chat-style supervised fine-tuning
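
The training script itself is not included in this repository. A minimal sketch of how a 10,000-sample subset could be drawn from HuggingFaceH4/ultrachat_200k with the datasets library is shown below; the shuffle seed is an assumption, not taken from the card.

from datasets import load_dataset

# train_sft split of UltraChat; keep a 10k subset for fast training
train_dataset = (
    load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
    .shuffle(seed=42)          # illustrative seed, not stated on the card
    .select(range(10_000))
)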

Hyperparameters

  • Max Steps: 1000
  • Max Sequence Length: 1024
  • Per-device Batch Size: 4
  • Gradient Accumulation Steps: 4 (effective batch size 16)
  • Learning Rate: 2e-4
  • Precision: bfloat16
  • Gradient Checkpointing: Enabled
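
Given the stated stack (TRL SFTTrainer + PEFT LoRA), a run with these hyperparameters might look roughly like the sketch below. The LoRA rank, alpha, dropout, target modules, and output directory are not listed on this card and are illustrative placeholders; exact SFTConfig argument names can also vary between TRL versions.

from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# LoRA settings: the values below are placeholders, not taken from the card
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Hyperparameters from the card; max_seq_length may be named max_length
# in newer TRL releases
training_args = SFTConfig(
    output_dir="qwen3.5-2b-chat-lora",   # illustrative output path
    max_steps=1000,
    max_seq_length=1024,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    bf16=True,
    gradient_checkpointing=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3.5-2B",      # base model; LoRA is applied via peft_config
    train_dataset=train_dataset,  # 10k UltraChat subset from the sketch above
    args=training_args,
    peft_config=lora_config,
)
trainer.train()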

Capabilities

  • Multi-turn conversational responses
  • Instruction-following chat behavior
  • Handles both short and long-form answers
  • Supports English and Hindi inputs (inherited from base model)

Limitations

  • Trained on a subset (10k samples), not full dataset
  • May produce inconsistent or incomplete responses
  • Not fully aligned for safety or factual accuracy
  • May still show verbosity or repetition

Example

  • Input: What is gravity?
  • Output: Gravity is a fundamental force that attracts objects with mass toward each other.

Model Type

  • Architecture: Causal Language Model
  • Task: Text Generation (Chat-style)
  • Fine-tuning: LoRA Adapter

License

This model is released under the CC BY-NC 4.0 license.
