
Libraries

How to use TwinDoc/RedWhale-tv-10.8B-sft-k with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="TwinDoc/RedWhale-tv-10.8B-sft-k")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TwinDoc/RedWhale-tv-10.8B-sft-k")
model = AutoModelForCausalLM.from_pretrained("TwinDoc/RedWhale-tv-10.8B-sft-k")

Local Apps

vLLM

How to use TwinDoc/RedWhale-tv-10.8B-sft-k with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "TwinDoc/RedWhale-tv-10.8B-sft-k"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TwinDoc/RedWhale-tv-10.8B-sft-k",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
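
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on localhost:8000. The helper names `build_completion_request` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "TwinDoc/RedWhale-tv-10.8B-sft-k"
# Assumption: the server was started with `vllm serve` and listens on port 8000.
BASE_URL = "http://localhost:8000/v1"


def build_completion_request(prompt, max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` mirrors the curl call. Passing `temperature=0.01` instead would match the inference setting listed in the model card.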

Use Docker

docker model run hf.co/TwinDoc/RedWhale-tv-10.8B-sft-k

SGLang

How to use TwinDoc/RedWhale-tv-10.8B-sft-k with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "TwinDoc/RedWhale-tv-10.8B-sft-k" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TwinDoc/RedWhale-tv-10.8B-sft-k",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "TwinDoc/RedWhale-tv-10.8B-sft-k" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TwinDoc/RedWhale-tv-10.8B-sft-k",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Model Description

This model was trained with supervised fine-tuning (SFT) on a RAG dataset created during a project for the client K-S. The training dataset is not released for security reasons.

About the Model

Name: TwinDoc/RedWhale-tv-10.8B-sft-k
Finetuned from model: TwinDoc/RedWhale-tv-10.8B-v1.0
Train Datasets: private
Developed by: AGILESODA (애자일소다)
Model type: llama
Language(s) (NLP): Korean
License: cc-by-nc-sa-4.0
Training settings
- LoRA r, alpha: 32, 32
- Dtype: bf16
- Epochs: 5
- Learning rate: 1e-5
- Global batch size: 1
- Context length: 4096
Inference settings
- BOS ID: 1
- EOS ID: 2
- Top-p: 0.95
- Temperature: 0.01
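
A minimal sketch of how the inference settings above could map onto Transformers `generate()` arguments. It assumes that sampling is enabled, since the card lists Top-p and Temperature without saying so explicitly. The function name `generate_answer` and the `max_new_tokens=512` default are illustrative choices, not values from the card.

```python
# Inference settings from the list above, expressed as generate() keyword arguments.
GENERATION_KWARGS = {
    "do_sample": True,  # assumption: top-p / temperature imply sampling
    "top_p": 0.95,
    "temperature": 0.01,
    "bos_token_id": 1,
    "eos_token_id": 2,
}


def generate_answer(model, tokenizer, prompt, max_new_tokens=512):
    """Generate a completion for `prompt` and return only the newly generated text.

    `model` and `tokenizer` are expected to be the objects returned by
    AutoModelForCausalLM.from_pretrained and AutoTokenizer.from_pretrained.
    """
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # hypothetical limit, not from the card
        **GENERATION_KWARGS,
    )
    # Drop the prompt tokens so that only the answer is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With a temperature this low, sampling behaves almost like greedy decoding, which suits answers that should stay close to the supplied source text.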

Prompt template

### User: 당신은 인공지능 비서입니다. 사용자가 여러분에게 과제를 줍니다. 당신의 목표는 가능한 한 충실하게 작업을 완료하는 것입니다. 작업을 수행하는 동안 단계별로 생각하고 단계를 정당화하세요. User의 질문이 주어지면 고품질의 답변을 만들어주세요.
원문: {CONTEXT}
질문: 원문을 참고하여 답변하세요. {QUESTION}
 ### Assistant: {ANSWER}
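
A small helper along these lines can fill the template at inference time. In English, the instruction text reads roughly: "You are an AI assistant. The user gives you a task. Your goal is to complete the task as faithfully as possible. While working, think step by step and justify your steps. Given the user's question, produce a high-quality answer." It is followed by "Source text: {CONTEXT}" and "Question: Answer with reference to the source text. {QUESTION}". The sketch assumes that the prompt should stop at `### Assistant:` so the model generates `{ANSWER}` itself, and that the template's whitespace should be kept exactly as shown. `build_prompt` is an illustrative name.

```python
# Instruction text copied verbatim from the template (kept in Korean on purpose).
SYSTEM_TEXT = (
    "당신은 인공지능 비서입니다. 사용자가 여러분에게 과제를 줍니다. "
    "당신의 목표는 가능한 한 충실하게 작업을 완료하는 것입니다. "
    "작업을 수행하는 동안 단계별로 생각하고 단계를 정당화하세요. "
    "User의 질문이 주어지면 고품질의 답변을 만들어주세요."
)

# The {ANSWER} slot is omitted: the model is expected to produce it.
# The leading space before "### Assistant:" is kept as it appears in the template.
PROMPT_TEMPLATE = "\n".join(
    [
        "### User: " + SYSTEM_TEXT,
        "원문: {context}",
        "질문: 원문을 참고하여 답변하세요. {question}",
        " ### Assistant:",
    ]
)


def build_prompt(context, question):
    """Fill the retrieved context and the user question into the template."""
    return PROMPT_TEMPLATE.format(
        context=context.strip(),
        question=question.strip(),
    )
```

The returned string can be passed directly as the prompt to a pipeline, to `generate()`, or to a completions endpoint.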

License

The content of this project, created by AGILESODA, is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0).

Citation

@misc{vo2024redwhaleadaptedkoreanllm,
      title={RedWhale: An Adapted Korean LLM Through Efficient Continual Pretraining}, 
      author={Anh-Dung Vo and Minseong Jung and Wonbeen Lee and Daewoo Choi},
      year={2024},
      eprint={2408.11294},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.11294}, 
}
