databricks/databricks-dolly-15k
How to use minhchuxuan/llama-2.7b-dolly-lora with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("/workspace/LMOps/minillm/checkpoints/Sheared-LLaMA-2.7B-Pruned/")
model = PeftModel.from_pretrained(base_model, "minhchuxuan/llama-2.7b-dolly-lora")

This is a LoRA adapter for LLaMA-2.7B, fine-tuned on the Databricks Dolly dataset for instruction-following tasks.
You need to install the required packages:
pip install transformers peft torch
Then load and use the model:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
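# Load the base model. The adapter targets a 2.7B base, so replace the
# placeholder below with that checkpoint; adapter weights generally will not
# load onto a base model whose layer shapes differ.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # Placeholder: update to the 2.7B base if available
    torch_dtype=torch.float16,
    device_map="auto",
)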
# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "minhchuxuan/llama-2.7b-dolly-lora",
)
# Optional: Merge adapter for faster inference
# model = model.merge_and_unload()
# Assumes the adapter repository also ships tokenizer files;
# if it does not, load the tokenizer from the base model instead.
tokenizer = AutoTokenizer.from_pretrained("minhchuxuan/llama-2.7b-dolly-lora")
# Generate
prompt = "Instruction: Write a short poem about AI.\n\nResponse:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_length=256,  # Total length in tokens, prompt included
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
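The prompt above follows a plain "Instruction: ... Response:" layout, and the decoded output repeats the prompt before the model's answer. A minimal sketch of two helpers for building such prompts and trimming the echoed prompt is shown below. The names `build_prompt` and `extract_response` are hypothetical, and the `Context:` layout for Dolly records that carry a context field is an assumption, not something this model card confirms.

```python
def build_prompt(instruction: str, context: str = "") -> str:
    """Format one request in the Instruction/Response layout used above."""
    instruction = instruction.strip()
    context = context.strip()
    if context:
        # Assumed layout for the optional context field; not confirmed by the model card.
        return f"Instruction: {instruction}\n\nContext: {context}\n\nResponse:"
    return f"Instruction: {instruction}\n\nResponse:"


def extract_response(generated_text: str, prompt: str) -> str:
    """Return only the text that follows the prompt in the decoded output."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.strip()


prompt = build_prompt("Write a short poem about AI.")
print(repr(prompt))
```

After generation, `extract_response(tokenizer.decode(outputs[0], skip_special_tokens=True), prompt)` would return just the answer, provided the decoded text begins with the exact prompt string.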
This model was fine-tuned using LoRA:
@inproceedings{lora,
title={LoRA: Low-Rank Adaptation of Large Language Models},
author={Hu, Edward J and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
booktitle={International Conference on Learning Representations},
year={2022}
}
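For context on why a LoRA adapter is small: the LoRA paper keeps each adapted weight matrix W frozen and trains a low-rank update BA, where B is d_out x r and A is r x d_in. The sketch below works through the parameter arithmetic for a single matrix. The dimensions and rank are illustrative assumptions only; this card does not state the rank or target modules used for this adapter.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A is (rank x d_in) and B is (d_out x rank); the frozen weight W is not trained.
    return rank * d_in + d_out * rank


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the full weight matrix, for comparison."""
    return d_in * d_out


# Hypothetical values chosen for illustration, not taken from this adapter.
d_in = d_out = 4096
rank = 8

added = lora_param_count(d_in, d_out, rank)
full = full_param_count(d_in, d_out)
print(f"LoRA trains {added:,} parameters versus {full:,} in the full matrix")
```

Under these assumed values the adapter trains well under one percent of the parameters of the matrix it modifies, which is why adapter checkpoints stay far smaller than the base model.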
This model is released under the Apache 2.0 license. Note that LLaMA models are also subject to Meta's own usage terms.
Base model
meta-llama/Llama-2-7b-hf