SFT_202212-earnings-sft
LoRA adapter fine-tuned from Diamegs/PIT-4B-FT-202212 on synthetic and natural QA pairs derived from S&P-500 earnings-call transcripts. Built under point-in-time (PIT) discipline: the base model is pretrained on a chronologically filtered FineWeb snapshot ending December 2022, and the SFT corpus is restricted to transcripts dated on or before the same cutoff, so no information from after the cutoff leaks into training.
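The point-in-time restriction amounts to a date filter on the transcript corpus. The snippet below is a minimal sketch of that idea, not the author's pipeline: the record layout, the `call_date` field, and the choice of 2022-12-31 as the exact cutoff day are assumptions made for illustration.

```python
from datetime import date

# Assumed cutoff: last day of December 2022, matching the "202212" tag.
PIT_CUTOFF = date(2022, 12, 31)


def filter_point_in_time(transcripts, cutoff=PIT_CUTOFF):
    """Keep only transcripts dated on or before the cutoff."""
    return [t for t in transcripts if t["call_date"] <= cutoff]


# Hypothetical records; the field names are illustrative only.
transcripts = [
    {"id": "a", "call_date": date(2022, 10, 27)},
    {"id": "b", "call_date": date(2022, 12, 31)},
    {"id": "c", "call_date": date(2023, 1, 26)},
]
kept = filter_point_in_time(transcripts)
print([t["id"] for t in kept])  # ['a', 'b']
```

The comparison is inclusive, so a call held on the cutoff day itself is kept, in line with "on or before the same cutoff".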
Model Details
- Base model: Diamegs/PIT-4B-FT-202212 (4B params, decoder-only, 2048-token context, instruction-tuned PIT checkpoint)
- Adapter type: LoRA (rank 16, alpha 32, dropout 0.05)
- Target modules: auto (attention + MLP projections)
- Trainable parameters: 23 M (0.6 % of base)
- Precision: bf16
- License: academic use; inherits any constraints of the base model
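The adapter figures above can be sanity-checked with a little arithmetic. This is a rough sketch: the rank, alpha, 23 M, and nominal 4B values come from the list, while the projection width of 2560 is a hypothetical placeholder, since the card does not state the base model's hidden size.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 16, 32
scaling = ALPHA / RANK  # standard LoRA scaling factor alpha / r

# Hypothetical square projection of width 2560, for illustration only.
hypothetical_width = 2560
per_module = lora_params(hypothetical_width, hypothetical_width, RANK)

# Trainable fraction, using the nominal 4B base size quoted in the card.
trainable, base = 23e6, 4e9
pct = 100 * trainable / base

print(scaling, per_module, round(pct, 1))
```

With these inputs the trainable share comes out near 0.6 %, consistent with the figure listed above.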
Training Data
Four QA buckets assembled from ~35 K earnings-call transcripts (≤ 2022-12), all chronologically filtered to respect PIT discipline:
| Bucket | Context | Question | Answer |
|---|---|---|---|
| Forward synthetic | Full prepared remarks (anonymized for generation) | Qwen2.5-32B-generated | Extracted evidence span |
| Forward natural | Embedding-selected paragraphs (BGE-base-en-v1.5) | Analyst question (verbatim) | Management response (echo-stripped) |
| Inverse natural | Management response | Template rotation | Analyst question (verbatim) |
| Unanswerable | Prepared remarks (anonymized) | Grounded but unanswerable | Fixed refusal |
The generator LLM (Qwen2.5-32B-Instruct-GPTQ-Int4) is used only to surface questions and evidence spans on anonymized inputs; it never produces answer text. This keeps the generator's lookahead bias out of the supervised signal.
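To make the table concrete, here is one way an "Inverse natural" record could be assembled. It is only a sketch: the template wording, the dictionary keys, and rotation by index modulo the template count are all assumptions, since the card names "template rotation" without specifying it.

```python
# Hypothetical instruction templates; the actual wording is not given in the card.
INVERSE_TEMPLATES = [
    "What analyst question would prompt this management response?",
    "Write the question that this answer is responding to.",
    "Which question from an analyst does this reply address?",
]


def build_inverse_example(index, management_response, analyst_question):
    """Context = management response, question = rotated template,
    answer = the analyst question, copied verbatim."""
    template = INVERSE_TEMPLATES[index % len(INVERSE_TEMPLATES)]
    return {
        "bucket": "inverse_natural",  # hypothetical label
        "context": management_response,
        "question": template,
        "answer": analyst_question,
    }


# Placeholder strings stand in for real transcript text.
example = build_inverse_example(4, "<management response>", "<analyst question>")
print(example["question"])
```

Note that the answer field is filled from the transcript itself, never from a generator, which matches the constraint described above.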
Training Procedure
- Framework: transformers + peft + trl.SFTTrainer
- Render format: PIT chat template (<|user|>\n...<|end|>\n<|assistant|>\n...<|end|>)
- Max sequence length: 2048 (hard-filtered upstream by verify_and_filter.py)
- Optimizer: AdamW, lr 2e-4, cosine schedule, warmup ratio 0.03
- Batch: per-device 1, gradient accumulation 16, effective batch 16
- Epochs: 1
- Hardware: 1× A100 40 GB (Run:AI cluster)
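The batch and schedule settings above can be illustrated in plain Python. This sketch assumes the common shape of a warmup-plus-cosine schedule (linear warmup to the peak rate, then cosine decay to zero); the exact step accounting inside the trainer and the total of 1000 steps are assumptions, not values from the run.

```python
import math

PEAK_LR = 2e-4
WARMUP_RATIO = 0.03


def effective_batch(per_device, grad_accum, num_devices=1):
    """Samples contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = max(1, math.ceil(total_steps * warmup_ratio))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical; the real count depends on dataset size
print(effective_batch(per_device=1, grad_accum=16))
print([lr_at(s, total_steps) for s in (0, 15, 500, 1000)])
```

On a single GPU, per-device batch 1 with 16 accumulation steps yields the effective batch of 16 stated above.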
Full hyperparameter list: run_config.json in this repo. Per-step training metrics: sft.metrics.csv.
Limitations
- Adapter only: must be loaded onto the exact base Diamegs/PIT-4B-FT-202212.
- 2048-token context inherited from base. Long transcripts require truncation or chunking.
- Trained on US-listed S&P-500 earnings calls only; generalization to other domains is untested.
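One way to handle the context limit is to split a tokenized transcript into overlapping windows. The sketch below is an assumption-laden illustration rather than a recommended recipe: the `reserve` budget (room for the question, template tokens, and generated answer) and the `overlap` size are hypothetical defaults.

```python
def chunk_token_ids(token_ids, max_len=2048, reserve=256, overlap=128):
    """Split a long token sequence into overlapping windows that
    leave `reserve` tokens free for the question and the answer."""
    window = max_len - reserve
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than the window")
    stride = window - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break
        start += stride
    return chunks


# Stand-in for tokenizer output: 5000 dummy token ids.
ids = list(range(5000))
chunks = chunk_token_ids(ids)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk can then be passed as context in a separate prompt; how answers from different chunks are combined is left open here.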
How to Use
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
repo = "jdecim/SFT_202212-earnings-sft"
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
# Reads the adapter config, loads the base model it names, then attaches the adapter.
model = AutoPeftModelForCausalLM.from_pretrained(
    repo, trust_remote_code=True, torch_dtype="bfloat16",
)
model.eval()
# The prompt follows the PIT chat template and ends with an open assistant turn.
prompt = "<|user|>\nWhat did management say about Q3 margins?\n<|end|>\n<|assistant|>\n"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
# Special tokens are kept so the <|end|> marker stays visible in the output.
print(tok.decode(out[0], skip_special_tokens=False))
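Because the decoded output keeps the special tokens, a pair of small helpers can build prompts and pull out just the assistant's reply. This is a sketch under assumptions: the layout mirrors the prompt string used above, but placing a transcript excerpt before the question (the `context` argument) is a guess, as the card does not say how context and question are joined.

```python
USER, ASSISTANT, END = "<|user|>", "<|assistant|>", "<|end|>"


def build_prompt(question, context=""):
    """Render one user turn and leave the assistant turn open.
    Prepending `context` to the question is an assumed layout."""
    body = f"{context}\n\n{question}" if context else question
    return f"{USER}\n{body}\n{END}\n{ASSISTANT}\n"


def extract_reply(decoded):
    """Return the text of the last assistant turn, without the markers."""
    reply = decoded.rsplit(ASSISTANT, 1)[-1]
    return reply.split(END, 1)[0].strip()


prompt = build_prompt("What did management say about Q3 margins?")
# Placeholder completion, not real model output.
decoded = prompt + "<model reply>" + END
print(extract_reply(decoded))
```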
Companion Model
jdecim/SFT_202112-earnings-sft: same recipe, base Diamegs/PIT-4B-FT-202112 (2021-12 cutoff)
Framework versions
- PEFT 0.19.1