TTC-L2V-2 (Danish, Swedish and Norwegian)

Model Description

Supervised model for sentence embeddings.

  • Developed by: Jesper Alkestrup, The Tech Collective
  • Model type: Embedding model
  • Language(s) (NLP): Danish, Swedish and Norwegian
  • Finetuned from model: AI-Sweden-Models/Llama-3-8B-instruct
  • Finetuning procedure: LLM2Vec

Trained using the approach outlined in the paper "LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders".

Usage

This is a sentence-transformers model: it loads directly and needs no packages beyond sentence-transformers. The bidirectional Llama encoder and the LLM2Vec instruction-aware mean pooling are provided by small custom modules in this repo, loaded via trust_remote_code.

pip install sentence-transformers

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jealk/TTC-L2V-supervised-2", trust_remote_code=True)

# Queries are encoded with an instruction; documents without one.
instruction = "Givet et spørgsmål, find relevante tekstudsnit, der besvarer det:"
queries = [
    "Hvordan påvirker søvn vores koncentrationsevne",
    "Hvad skal man være opmærksom på, når man køber brugt cykel",
]
documents = [
    "Forskning viser, at for lidt søvn kan nedsætte både koncentration og hukommelse. "
    "Allerede efter én nat med dårlig søvn kan man opleve problemer med at fokusere.",
    "Når du køber en brugt cykel, bør du tjekke, om stellet har skader eller rust, og "
    "om gear og bremser fungerer korrekt.",
]

q_emb = model.encode(queries, prompt=instruction)   # or prompt_name="query"
d_emb = model.encode(documents)

# Cosine similarity
print(model.similarity(q_emb, d_emb))
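
The similarity matrix has one row per query and one column per document, so retrieval comes down to sorting each row. The following is a minimal sketch of that ranking step in plain Python. The helper names `cosine` and `rank_documents` and the toy 3-dimensional vectors are illustrative assumptions, not part of this repo or of sentence-transformers.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted by descending cosine similarity."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real embeddings (hypothetical values).
toy_queries = [[1.0, 0.1, 0.0], [0.0, 0.2, 1.0]]
toy_docs = [[0.9, 0.2, 0.1], [0.1, 0.1, 0.8]]

# Index of the best-matching document for each query.
best = [rank_documents(q, toy_docs)[0][0] for q in toy_queries]
print(best)
```

With the real model, the rows of q_emb and d_emb can be passed in place of the toy vectors. Taking the per-row argmax of the tensor returned by model.similarity should select the same top documents.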

The instruction is also registered as a named prompt: model.encode(queries, prompt_name="query") applies the default Danish instruction shown above.

Embeddings are mean-pooled over the content tokens only; the instruction and the beginning-of-text token are excluded, as in LLM2Vec.
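
To make the pooling rule concrete, here is a minimal sketch in plain Python that averages only the content-token vectors. It is an illustration of the idea, not the pooling module shipped in this repo. The function name `mean_pool_content`, the `n_prefix_tokens` parameter and the toy hidden states are assumptions made for the example.

```python
def mean_pool_content(token_vectors, n_prefix_tokens):
    """Average the token vectors after skipping the first n_prefix_tokens.

    n_prefix_tokens counts the beginning-of-text token plus any
    instruction tokens that precede the actual content.
    """
    content = token_vectors[n_prefix_tokens:]
    if not content:
        raise ValueError("no content tokens left to pool")
    dim = len(content[0])
    return [sum(vec[j] for vec in content) / len(content) for j in range(dim)]

# Toy sequence (hypothetical values): BOS, two instruction tokens, two content tokens.
hidden = [
    [9.0, 9.0],  # beginning-of-text token (excluded)
    [5.0, 5.0],  # instruction token (excluded)
    [5.0, 5.0],  # instruction token (excluded)
    [1.0, 3.0],  # content token
    [3.0, 5.0],  # content token
]

pooled = mean_pool_content(hidden, n_prefix_tokens=3)
print(pooled)  # [2.0, 4.0]
```

For a document encoded without an instruction, only the beginning-of-text token would be skipped, which corresponds to n_prefix_tokens=1 in this sketch.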

Model size and quantization

The model has ~8B parameters and loads in roughly 14 GB (bfloat16). To run on smaller GPUs, it can be loaded in quantized form via bitsandbytes (which must be installed separately):

from sentence_transformers import SentenceTransformer
from transformers import BitsAndBytesConfig

model = SentenceTransformer(
    "jealk/TTC-L2V-supervised-2", trust_remote_code=True,
    model_kwargs={"quantization_config": BitsAndBytesConfig(load_in_8bit=True)},
)

Approximate memory footprint: bfloat16 ≈ 14 GB, 8-bit ≈ 8 GB, 4-bit ≈ 5 GB (load_in_4bit=True).
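
Switching between the three precisions can be wrapped in a small helper. The sketch below is one possible way to do it. The names `quantization_flags` and `load_model` are hypothetical, and the library imports are deferred into `load_model` so the mapping itself can be inspected without sentence-transformers or bitsandbytes installed.

```python
def quantization_flags(bits):
    """Map a bit width to BitsAndBytesConfig keyword arguments.

    Returns None for 16, meaning no quantization (plain bfloat16 weights).
    """
    if bits == 16:
        return None
    if bits == 8:
        return {"load_in_8bit": True}
    if bits == 4:
        return {"load_in_4bit": True}
    raise ValueError(f"unsupported bit width: {bits}")

def load_model(bits=16, name="jealk/TTC-L2V-supervised-2"):
    """Load the model at the requested precision (hypothetical helper)."""
    # Deferred imports: only needed when a model is actually loaded.
    from sentence_transformers import SentenceTransformer
    from transformers import BitsAndBytesConfig

    flags = quantization_flags(bits)
    model_kwargs = {}
    if flags is not None:
        model_kwargs["quantization_config"] = BitsAndBytesConfig(**flags)
    return SentenceTransformer(name, trust_remote_code=True, model_kwargs=model_kwargs)
```

Calling load_model(bits=4) would then correspond to the load_in_4bit=True variant mentioned above.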

The quantized variants have not been evaluated on benchmarks, so embedding quality may differ from the full-precision model.

Notes

  • trust_remote_code=True is required: it loads the bidirectional Llama encoder and the LLM2Vec-style pooling module shipped in this repo.

Credits

Approach from LLM2Vec (McGill-NLP). Related model: https://huggingface.co/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised
