Feature Extraction
Transformers
Safetensors
sentence-transformers
English
Chinese
c2llm
code
custom_code
Instructions to use codefuse-ai/C2LLM-0.5B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use codefuse-ai/C2LLM-0.5B with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("feature-extraction", model="codefuse-ai/C2LLM-0.5B", trust_remote_code=True)

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("codefuse-ai/C2LLM-0.5B", trust_remote_code=True, dtype="auto")
- sentence-transformers
How to use codefuse-ai/C2LLM-0.5B with sentence-transformers:
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("codefuse-ai/C2LLM-0.5B", trust_remote_code=True)

    sentences = [
        "The weather is lovely today.",
        "It's so sunny outside!",
        "He drove to the stadium.",
    ]
    embeddings = model.encode(sentences)

    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)  # [3, 3]
- Notebooks
- Google Colab
- Kaggle
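The sentence-transformers usage above ends by calling model.similarity on three embeddings and getting a 3 x 3 result. As a rough illustration of what that matrix contains, here is a minimal pure-Python sketch. It assumes the similarity function is cosine similarity, which is the sentence-transformers default unless the model configuration selects another one. The helper names and the toy 2-dimensional vectors are made up for the example and are not real model outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(embeddings):
    """Pairwise cosine similarities: entry [i][j] compares row i with row j."""
    return [[cosine_similarity(u, v) for v in embeddings] for u in embeddings]


# Toy stand-ins for three sentence embeddings (hypothetical values).
toy_embeddings = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
matrix = similarity_matrix(toy_embeddings)

for row in matrix:
    print([round(value, 3) for value in row])
```

Each row and column corresponds to one input sentence, so three sentences always give a 3 x 3 matrix with values close to 1 on the diagonal.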
Qwen or not?
#1
by clover-supply - opened
The model page says it's based on Qwen, but when I try to convert the model to GGUF it says the architecture is not supported. Is it too different now?
Yes, it's based on Qwen2.5. However, as described in the technical report, we apply a PMA layer on top of the model, so you will need to load with trust_remote_code=True.
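To give a feel for why an extra pooling layer makes the checkpoint differ from a plain Qwen2.5 model, here is a heavily simplified sketch of attention-based pooling. It assumes PMA refers to pooling by multihead attention, where a learned query attends over the per-token outputs to produce one sequence embedding. This version uses a single head and no projection matrices, so it is not the C2LLM implementation. The function name and toy values are hypothetical.

```python
import math


def attention_pool(token_embeddings, query):
    """Pool a sequence of token vectors into one vector using a single query.

    Simplified single-head attention: scaled dot-product scores, softmax,
    then a weighted sum of the token vectors.
    """
    dim = len(query)
    scores = [
        sum(q * t for q, t in zip(query, token)) / math.sqrt(dim)
        for token in token_embeddings
    ]
    # Numerically stable softmax over the token positions.
    max_score = max(scores)
    exps = [math.exp(score - max_score) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]
    pooled = [
        sum(weight * token[i] for weight, token in zip(weights, token_embeddings))
        for i in range(dim)
    ]
    return pooled, weights


# Toy token outputs (hypothetical). A zero query gives uniform weights,
# so the result equals mean pooling; a trained query would weight tokens unevenly.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 0.0]
pooled, weights = attention_pool(tokens, query)
print(pooled, weights)
```

Because the query and any related parameters are additional learned weights, a converter that only knows the stock Qwen2.5 layout has no slot for them, which is consistent with the "architecture is not supported" message.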
What is the maximum number of input tokens, and how many embedding dimensions are there, please?
C2LLM-0.5B has an embedding dimension of 896, and C2LLM-7B has an embedding dimension of 3584. Both models support 8192 input tokens.
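The two embedding dimensions matter mostly for storage when indexing a large corpus. The arithmetic can be sketched as follows, assuming each value is stored as an uncompressed 4-byte float32 with no index overhead. The helper name and the one-million-vector corpus size are illustrative choices, not figures from the model authors.

```python
BYTES_PER_FLOAT32 = 4

# Embedding dimensions as stated in the reply above.
EMBEDDING_DIMS = {"C2LLM-0.5B": 896, "C2LLM-7B": 3584}


def index_size_bytes(num_vectors, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw size of num_vectors dense embeddings of the given dimension."""
    return num_vectors * dim * bytes_per_value


for name, dim in EMBEDDING_DIMS.items():
    size = index_size_bytes(1_000_000, dim)
    print(f"{name}: {size / 1e9:.2f} GB per million vectors")
```

Under these assumptions, one million vectors take about 3.58 GB at 896 dimensions and about 14.34 GB at 3584 dimensions, a fourfold difference.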