Libraries

PEFT

How to use jdecim/SFT_202212-earnings-sft with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("Diamegs/PIT-4B-FT-202212")
model = PeftModel.from_pretrained(base_model, "jdecim/SFT_202212-earnings-sft")

Transformers

How to use jdecim/SFT_202212-earnings-sft with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="jdecim/SFT_202212-earnings-sft")

# Load model directly (use the causal-LM class so the generation head is included)
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("jdecim/SFT_202212-earnings-sft", dtype="auto")

Local Apps

vLLM

How to use jdecim/SFT_202212-earnings-sft with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "jdecim/SFT_202212-earnings-sft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jdecim/SFT_202212-earnings-sft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/jdecim/SFT_202212-earnings-sft

SGLang

How to use jdecim/SFT_202212-earnings-sft with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "jdecim/SFT_202212-earnings-sft" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jdecim/SFT_202212-earnings-sft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "jdecim/SFT_202212-earnings-sft" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jdecim/SFT_202212-earnings-sft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
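
The curl calls above can also be issued from Python. This is a minimal sketch using only the standard library, assuming a server is already listening on one of the ports shown above (8000 for vLLM, 30000 for SGLang) and exposes the OpenAI-compatible `/v1/completions` route. The helper names `build_completion_request` and `complete` are illustrative, not part of any library.

```python
import json
import urllib.request

MODEL_ID = "jdecim/SFT_202212-earnings-sft"


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build the same POST request that the curl examples send."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, model=MODEL_ID, timeout=60):
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]

# Usage (needs a running server, so it is left as a comment):
# print(complete("http://localhost:8000", "Once upon a time,"))
```

Splitting request construction from the network call keeps the payload easy to inspect without a live server.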

SFT_202212-earnings-sft

LoRA adapter fine-tuned from Diamegs/PIT-4B-FT-202212 on synthetic and natural QA pairs derived from S&P-500 earnings-call transcripts. It is built under point-in-time (PIT) discipline: the base model is pretrained on a chronologically filtered FineWeb snapshot ending December 2022, and the SFT corpus is restricted to transcripts dated on or before the same cutoff, so no post-cutoff information leaks into training.

Model Details

Base model: Diamegs/PIT-4B-FT-202212 (4B params, decoder-only, 2048-token context, instruction-tuned PIT checkpoint)
Adapter type: LoRA (rank 16, alpha 32, dropout 0.05)
Target modules: auto (attention + MLP projections)
Trainable parameters: ~23 M (~0.6 % of base)
Precision: bf16
License: academic use; inherits any constraints of the base model
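
The adapter figures above can be sanity-checked with a little arithmetic. This is a minimal sketch, assuming the usual LoRA formulation in which a frozen weight of shape (d_out, d_in) gains two low-rank factors totalling rank * (d_in + d_out) parameters and the update is scaled by alpha / rank. The layer width used below is hypothetical and is not read from the base model's config.

```python
RANK, ALPHA = 16, 32                    # from the model card
TRAINABLE, BASE_PARAMS = 23e6, 4e9      # approximate figures from the model card

HYPOTHETICAL_WIDTH = 2560               # illustrative layer width, NOT the real one


def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A is (rank, d_in), B is (d_out, rank)."""
    return rank * (d_in + d_out)


def lora_scaling(alpha: float, rank: int) -> float:
    """Factor applied to the low-rank update B @ A."""
    return alpha / rank


per_layer = lora_extra_params(HYPOTHETICAL_WIDTH, HYPOTHETICAL_WIDTH, RANK)
print(per_layer)                                   # 81920 for the hypothetical square layer
print(lora_scaling(ALPHA, RANK))                   # 2.0
print(f"{100 * TRAINABLE / BASE_PARAMS:.1f} %")    # share of the base that is trainable
```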

Training Data

Four QA buckets assembled from ~35 K earnings-call transcripts (≤ 2022-12), all chronologically filtered to respect PIT discipline:

Bucket	Context	Question	Answer
Forward synthetic	Full prepared remarks (anonymized for generation)	Qwen2.5-32B-generated	Extracted evidence span
Forward natural	Embedding-selected paragraphs (BGE-base-en-v1.5)	Analyst question (verbatim)	Management response (echo-stripped)
Inverse natural	Management response	Template rotation	Analyst question (verbatim)
Unanswerable	Prepared remarks (anonymized)	Grounded but unanswerable	Fixed refusal

The generator LLM (Qwen2.5-32B-Instruct-GPTQ-Int4) is used only to surface questions and evidence spans on anonymized inputs; it never produces answer text. This keeps the generator's lookahead bias out of the supervised signal.
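
The chronological filter that the buckets rely on can be illustrated as follows. This is a sketch only: the record layout (`id`, `call_date`) and the sample rows are made up for illustration, and reading "≤ 2022-12" as "on or before 2022-12-31" is an assumption.

```python
from datetime import date

# Assumption: the 2022-12 cutoff is inclusive of the whole month.
PIT_CUTOFF = date(2022, 12, 31)


def within_cutoff(records, cutoff=PIT_CUTOFF):
    """Keep only records whose call date is on or before the cutoff."""
    return [r for r in records if r["call_date"] <= cutoff]


# Illustrative rows, not real transcripts.
transcripts = [
    {"id": "a", "call_date": date(2022, 10, 27)},
    {"id": "b", "call_date": date(2022, 12, 31)},
    {"id": "c", "call_date": date(2023, 1, 26)},
]

kept = within_cutoff(transcripts)
print([r["id"] for r in kept])  # ['a', 'b']
```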

Training Procedure

Framework: transformers + peft + trl.SFTTrainer
Render format: PIT chat template (<|user|>\n...<|end|>\n<|assistant|>\n...<|end|>)
Max sequence length: 2048 (hard-filtered upstream by verify_and_filter.py)
Optimizer: AdamW, lr 2e-4, cosine schedule, warmup ratio 0.03
Batch: per-device 1, gradient accumulation 16, effective batch 16
Epochs: 1
Hardware: 1× A100 40 GB (Run:AI cluster)
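
The render format listed above can be expressed as two small string helpers. This is a sketch under the assumption that the template string in the list is literal, including its newlines; the function names are illustrative, and the actual training pipeline may handle whitespace differently.

```python
USER_TAG, ASSISTANT_TAG, END_TAG = "<|user|>", "<|assistant|>", "<|end|>"


def render_prompt(question: str) -> str:
    """Render the user turn and leave the assistant turn open for generation."""
    return f"{USER_TAG}\n{question}{END_TAG}\n{ASSISTANT_TAG}\n"


def render_example(question: str, answer: str) -> str:
    """Render a full supervised example: user turn plus closed assistant turn."""
    return f"{render_prompt(question)}{answer}{END_TAG}"


print(render_example("What drove revenue growth?", "Higher volumes."))
```

Building the training example on top of the prompt helper guarantees that inference prompts are an exact prefix of what the model saw during SFT.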

Full hyperparameter list: run_config.json in this repo. Per-step training metrics: sft.metrics.csv.
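
For intuition about the optimizer settings, the sketch below computes the effective batch size and a linear-warmup-then-cosine learning rate using the values listed in this section. It assumes the common schedule shape (linear ramp to the peak, then half a cosine down to zero); whether the run used exactly this shape is an assumption, and `TOTAL_STEPS` is a hypothetical number, since the real step count is not stated here.

```python
import math

PEAK_LR, WARMUP_RATIO = 2e-4, 0.03             # from the model card
PER_DEVICE, GRAD_ACCUM, NUM_DEVICES = 1, 16, 1  # from the model card
TOTAL_STEPS = 1000                              # hypothetical, for illustration only


def effective_batch(per_device: int, grad_accum: int, num_devices: int) -> int:
    """Examples contributing to each optimizer step."""
    return per_device * grad_accum * num_devices


def warmup_steps(total_steps: int, ratio: float) -> int:
    """Number of optimizer steps spent ramping up the learning rate."""
    return math.ceil(total_steps * ratio)


def lr_at(step: int, total_steps: int, peak_lr: float = PEAK_LR,
          ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given optimizer step."""
    warm = warmup_steps(total_steps, ratio)
    if step < warm:
        return peak_lr * step / max(1, warm)
    progress = (step - warm) / max(1, total_steps - warm)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch(PER_DEVICE, GRAD_ACCUM, NUM_DEVICES))  # 16
for s in (0, warmup_steps(TOTAL_STEPS, WARMUP_RATIO), TOTAL_STEPS // 2, TOTAL_STEPS):
    print(s, lr_at(s, TOTAL_STEPS))
```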

Limitations

Adapter only — must be loaded onto the exact base Diamegs/PIT-4B-FT-202212.
2048-token context inherited from base. Long transcripts require truncation or chunking.
Trained on US-listed S&P-500 earnings calls only; generalization to other domains is untested.
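
One way to handle the context limit is to split a long transcript into overlapping windows of token ids before prompting. This is a minimal sketch, not the procedure used to build the training data: the `reserve` budget (room for the question, template tokens, and generated answer) and the `overlap` size are assumed values to tune.

```python
def chunk_token_ids(token_ids, max_len=2048, reserve=256, overlap=128):
    """Split token ids into overlapping windows that fit the context budget.

    max_len : context length of the base model (2048 per the model card)
    reserve : tokens held back for the question, template, and answer (assumed)
    overlap : tokens shared between consecutive windows (assumed)
    """
    window = max_len - reserve
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need max_len > reserve and 0 <= overlap < window")
    step = window - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # the last window already reaches the end of the input
    return chunks


# Stand-in for tokenizer output: 5000 consecutive integer ids.
ids = list(range(5000))
chunks = chunk_token_ids(ids)
print(len(chunks), [len(c) for c in chunks])  # 3 [1792, 1792, 1672]
```

In practice the ids would come from the tokenizer, and each window would be decoded or embedded into the chat template before generation.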

How to Use

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

repo = "jdecim/SFT_202212-earnings-sft"

# Load the tokenizer and the adapter; the base model is resolved from the adapter config.
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoPeftModelForCausalLM.from_pretrained(
    repo, trust_remote_code=True, torch_dtype="bfloat16",
)
model.eval()

# Render the question in the PIT chat template and leave the assistant turn open.
prompt = "<|user|>\nWhat did management say about Q3 margins?\n<|end|>\n<|assistant|>\n"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
# The decoded text includes the prompt; special tokens are kept (skip_special_tokens=False).
print(tok.decode(out[0], skip_special_tokens=False))
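
Because the decoded string contains the prompt as well as the completion, a small post-processing step helps isolate the answer. This is a sketch that assumes the output follows the chat template described under Training Procedure; `extract_reply` is an illustrative helper, and the sample string is made up rather than real model output.

```python
def extract_reply(decoded: str, assistant_tag: str = "<|assistant|>",
                  end_tag: str = "<|end|>") -> str:
    """Return the text of the last assistant turn, without template markers."""
    if assistant_tag not in decoded:
        return ""
    # Take everything after the last assistant marker...
    _, _, tail = decoded.rpartition(assistant_tag)
    # ...and cut at the first end marker, if the model emitted one.
    reply, _, _ = tail.partition(end_tag)
    return reply.strip()


# Made-up example of a decoded generation.
sample = (
    "<|user|>\nWhat did management say about Q3 margins?\n<|end|>\n"
    "<|assistant|>\nMargins expanded.<|end|>"
)
print(extract_reply(sample))  # Margins expanded.
```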

Companion Model

jdecim/SFT_202112-earnings-sft — same recipe, base Diamegs/PIT-4B-FT-202112 (2021-12 cutoff)

Framework versions

PEFT 0.19.1
