QuantFactory/Turkish-Llama-8b-Instruct-v0.1-GGUF

This is a quantized version of ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1, created using llama.cpp.
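
The snippets later in this card load the original full-precision checkpoint with Transformers. To run one of the GGUF files from this repository instead, something like the following sketch may work, assuming the llama-cpp-python package (with huggingface_hub) is installed. The `filename` glob, the `n_ctx` value, and the helper names `build_messages` and `chat_with_gguf` are illustrative assumptions, not part of the original card; check the repository's file list for the actual quantization file names.

```python
def build_messages(question: str) -> list[dict]:
    """Build a chat in the same system/user format used elsewhere in this card."""
    return [
        # Turkish: "You are an AI assistant."
        {"role": "system", "content": "Sen bir yapay zeka asistanısın."},
        {"role": "user", "content": question},
    ]


def chat_with_gguf(question: str, filename: str = "*Q4_K_M.gguf") -> str:
    """Download a GGUF file from the Hub and answer one question with it."""
    # Imported lazily so build_messages works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="QuantFactory/Turkish-Llama-8b-Instruct-v0.1-GGUF",
        filename=filename,  # assumed glob pattern; verify against the repo files
        n_ctx=4096,         # assumed context size for this example
        verbose=False,
    )
    out = llm.create_chat_completion(
        messages=build_messages(question),
        max_tokens=256,
        temperature=0.6,
        top_p=0.9,
    )
    return out["choices"][0]["message"]["content"]
```

Calling `chat_with_gguf("Türkiye'nin başkenti neresidir?")` would download the matching file on first use, so the first call can take a while.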

Model Description

This model is a fully fine-tuned version of the "meta-llama/Meta-Llama-3-8B-Instruct" model with a 30GB Turkish dataset.

Cosmos LLaMa Instruct is designed for text generation: it continues a given text snippet coherently and in context. Because the training data is diverse (websites, books, and other text sources), the model can exhibit biases. Users should be aware of these biases and use the model responsibly.

Transformers pipeline

import transformers
import torch

model_id = "ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

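# Prompts are in Turkish. English gloss:
#   system: "You are an AI assistant. The user will give you a task. Your goal is to
#            complete it as faithfully as possible. Think step by step and justify your steps."
#   user:   "Question: A car's tank holds 60 litres of petrol. The car uses 8 litres
#            per 100 km. How many kilometres can it travel on a full tank?"
messages = [
    {"role": "system", "content": "Sen bir yapay zeka asistanısın. Kullanıcı sana bir görev verecek. Amacın görevi olabildiğince sadık bir şekilde tamamlamak. Görevi yerine getirirken adım adım düşün ve adımlarını gerekçelendir."},
    {"role": "user", "content": "Soru: Bir arabanın deposu 60 litre benzin alabiliyor. Araba her 100 kilometrede 8 litre benzin tüketiyor. Depo tamamen doluyken araba kaç kilometre yol alabilir?"},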
]

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][-1])
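
The sample question above has a closed-form answer, which is handy for sanity-checking whatever the model generates. A minimal sketch of that arithmetic follows; the function name `max_range_km` is made up for this illustration.

```python
def max_range_km(tank_litres: float, litres_per_100km: float) -> float:
    """Distance covered on a full tank at a fixed consumption rate."""
    # Each 100 km uses `litres_per_100km`, so the tank lasts
    # tank_litres / litres_per_100km blocks of 100 km.
    return tank_litres / litres_per_100km * 100


# Values from the sample prompt: 60 litre tank, 8 litres per 100 km.
print(max_range_km(60, 8))  # 750.0
```

Because sampling is enabled (`do_sample=True`), the wording of the model's answer will vary between runs, but a correct answer should arrive at 750 km.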

Transformers AutoModelForCausalLM

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Same Turkish prompts as in the pipeline example: a system instruction to solve the
# task step by step, and a user question about how far a car can travel on a
# 60 litre tank at 8 litres per 100 km.
messages = [
    {"role": "system", "content": "Sen bir yapay zeka asistanısın. Kullanıcı sana bir görev verecek. Amacın görevi olabildiğince sadık bir şekilde tamamlamak. Görevi yerine getirirken adım adım düşün ve adımlarını gerekçelendir."},
    {"role": "user", "content": "Soru: Bir arabanın deposu 60 litre benzin alabiliyor. Araba her 100 kilometrede 8 litre benzin tüketiyor. Depo tamamen doluyken araba kaç kilometre yol alabilir?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
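
To make it clearer why `<|eot_id|>` is passed as an extra terminator, here is a rough pure-Python sketch of the prompt string that `apply_chat_template` is expected to produce. It assumes this fine-tune kept the stock Llama 3 Instruct chat template, which the card does not state; the tokenizer's own template is authoritative, and `render_llama3_prompt` is a hypothetical helper, not a Transformers function.

```python
def render_llama3_prompt(messages, add_generation_prompt=True):
    """Approximate the stock Llama 3 Instruct chat template as a plain string."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        # Every turn is wrapped in a role header and closed with <|eot_id|>.
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn; the model is expected to close it with <|eot_id|>.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


demo = [
    {"role": "system", "content": "Sen bir yapay zeka asistanısın."},
    {"role": "user", "content": "Merhaba!"},
]
print(render_llama3_prompt(demo))
```

Under this format each assistant turn ends with `<|eot_id|>` rather than the tokenizer's default end-of-sequence token, which is why both IDs appear in `terminators`.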

Model Contact

COSMOS AI Research Group, Yildiz Technical University Computer Engineering Department
https://cosmos.yildiz.edu.tr/
cosmos@yildiz.edu.tr


license: llama3

GGUF

Model size: 8B params
Architecture: llama
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit