SFT_202112: LoRA adapter for PIT-4B-FT on earnings-call QA
LoRA adapter trained on top of Diamegs/PIT-4B-FT-202112. Fine-tuned on jdecim/pit-earnings-call-qa for question answering over US earnings-call transcripts, respecting PIT chronological discipline.
Training
- Base: Diamegs/PIT-4B-FT-202112
- Method: LoRA via 🤗 peft
- r = 16, α = 32, dropout = 0.05
- target_modules = auto
- label_shift_mode: auto (PIT models pre-shift labels; do NOT use model mode, it causes identity collapse)
- render_format: pit (PIT chat template: <|user|> / <|assistant|> / <|end|>)
- Epochs: 1.0
- Approx steps: 11046
- Effective batch size: 16
- LR: 2e-4 (cosine, warmup_ratio=0.03)
- Final eval loss: 0.3224
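To make the schedule numbers above concrete, here is a minimal sketch of the learning-rate curve and the LoRA scaling factor they imply. The card only says "cosine, warmup_ratio=0.03", so the linear warmup, the decay to zero, and the rounding of the warmup length are assumptions modelled on common practice, and `lr_at` is a hypothetical helper, not part of the training code.

```python
import math

# Values taken from the Training list above.
BASE_LR = 2e-4
TOTAL_STEPS = 11046       # "approx steps" on the card
WARMUP_RATIO = 0.03
EFFECTIVE_BATCH = 16
LORA_R, LORA_ALPHA = 16, 32


def lr_at(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step.

    Assumed shape: linear warmup from 0, then cosine decay to 0.
    """
    warmup = math.ceil(total * warmup_ratio)  # assumption: round up
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Standard LoRA scaling applied to the low-rank update: alpha / r.
lora_scaling = LORA_ALPHA / LORA_R

# Rough number of training sequences seen in the single epoch.
approx_examples = TOTAL_STEPS * EFFECTIVE_BATCH

if __name__ == "__main__":
    for s in (0, 100, 332, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(s, lr_at(s))
    print("LoRA scaling:", lora_scaling)
    print("approx examples:", approx_examples)
```

Under these assumptions the warmup covers roughly the first 332 steps, and α = 32 with r = 16 means the adapter update is scaled by 2.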
Usage
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base = AutoModelForCausalLM.from_pretrained(
    "Diamegs/PIT-4B-FT-202112",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
tok = AutoTokenizer.from_pretrained("Diamegs/PIT-4B-FT-202112", trust_remote_code=True)
model = PeftModel.from_pretrained(base, "jdecim/SFT_202112-earnings-sft").to("cuda").eval()
prompt = "<|user|>\nQuestion: What was Q4 net revenue?\nContext: …\n<|assistant|>\n"
ids = tok(prompt, return_tensors="pt").to("cuda")
out = model.generate(**ids, max_new_tokens=256, do_sample=False,
                     eos_token_id=tok.encode("<|end|>", add_special_tokens=False)[-1:])
print(tok.decode(out[0, ids.input_ids.shape[1]:], skip_special_tokens=False))
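The prompt string in the snippet above follows the PIT chat template by hand. As a small sketch, the same layout can be wrapped in helpers so question and context are filled in consistently. `build_prompt` and `clean_answer` are hypothetical names introduced here for illustration; the "Question:" / "Context:" layout is copied from the usage example rather than from a formal template specification.

```python
# Marker strings from the PIT chat template listed under Training.
USER, ASSISTANT, END = "<|user|>", "<|assistant|>", "<|end|>"


def build_prompt(question: str, context: str) -> str:
    """Render one QA turn in the same layout as the usage example."""
    return f"{USER}\nQuestion: {question}\nContext: {context}\n{ASSISTANT}\n"


def clean_answer(decoded: str) -> str:
    """Keep only the text before the first end marker, if one is present."""
    return decoded.split(END, 1)[0].strip()


if __name__ == "__main__":
    # The context is left as a placeholder, as in the usage example.
    print(build_prompt("What was Q4 net revenue?", "…"))
```

Because the usage example decodes with `skip_special_tokens=False`, the generated text may still contain the end marker; `clean_answer` is one way to trim it before display.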
Dataset
jdecim/pit-earnings-call-qa: see that page for the four QA buckets, split sizes, and PIT discipline details. This adapter was trained on the snapshot matching Diamegs/PIT-4B-FT-202112:
from datasets import load_dataset
snapshot = "202112" # the trailing YYYYMM in the base model name
ds = load_dataset("jdecim/pit-earnings-call-qa", snapshot, split="train")
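Since the snapshot is described as the trailing YYYYMM of the base model name, it can be derived instead of hard-coded. The following is a sketch only: `snapshot_from_base` is a hypothetical helper, and it assumes the base model id always ends in a hyphen followed by six digits.

```python
import re


def snapshot_from_base(base_model_id: str) -> str:
    """Extract the trailing YYYYMM snapshot tag from a base model id."""
    match = re.search(r"-(\d{6})$", base_model_id)
    if match is None:
        raise ValueError(f"no trailing YYYYMM in {base_model_id!r}")
    snapshot = match.group(1)
    month = int(snapshot[4:])
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in snapshot {snapshot!r}")
    return snapshot


if __name__ == "__main__":
    print(snapshot_from_base("Diamegs/PIT-4B-FT-202112"))
```

The returned string can then be passed as the configuration name to `load_dataset`, which keeps the adapter, base model, and dataset snapshot aligned.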
License
MIT.