jhflow/orca_ko_en_pair
How to use StatPan/singung-sft-v0.1 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="StatPan/singung-sft-v0.1")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("StatPan/singung-sft-v0.1")
model = AutoModelForCausalLM.from_pretrained("StatPan/singung-sft-v0.1")
How to use StatPan/singung-sft-v0.1 with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "StatPan/singung-sft-v0.1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "StatPan/singung-sft-v0.1",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use StatPan/singung-sft-v0.1 with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "StatPan/singung-sft-v0.1" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "StatPan/singung-sft-v0.1",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "StatPan/singung-sft-v0.1" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "StatPan/singung-sft-v0.1",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use StatPan/singung-sft-v0.1 with Docker Model Runner:
docker model run hf.co/StatPan/singung-sft-v0.1
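The curl calls above can also be made from Python. The following is a minimal sketch using only the standard library, assuming a vLLM or SGLang server is already running locally and exposes the OpenAI-compatible `/v1/completions` endpoint shown above. The helper names `build_completion_request` and `complete` are hypothetical and are not part of either library.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="StatPan/singung-sft-v0.1",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text; needs a running server."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the text under choices[0].text.
    return body["choices"][0]["text"]
```

For the SGLang server started above, pass `base_url="http://localhost:30000"`, for example `complete("Once upon a time,", base_url="http://localhost:30000")`.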
This model was developed from the Mistral-7B base model. "Mistral" is also the name of an anti-aircraft weapon, which inspired Korea's own anti-aircraft weapon, Singung. Just as that weapon followed its inspiration, this model is named "Singung" because it builds on the Mistral model.
The model was fine-tuned with LoRA, and the LoRA update is already included in the released weights.
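To illustrate what it means for a LoRA update to be included in the weights, here is a toy sketch of the usual LoRA merge, W' = W + (alpha / r) * B A, in plain Python. The matrices, the rank, and the `alpha` value are made-up illustration values; the model card does not state the rank, scaling, or target modules actually used, and `merge_lora` is a hypothetical helper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) as a new matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy example: a 2x2 frozen weight with a rank-1 adapter (assumed values).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # shape (2, rank)
A = [[0.5, 0.5]]     # shape (rank, 2)
merged = merge_lora(W, B, A, alpha=2, rank=1)
print(merged)  # [[2.0, 1.0], [2.0, 3.0]]
```

Once merged, the model loads like any ordinary checkpoint, which is consistent with the plain `from_pretrained` calls shown earlier.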
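# System prompt: "You are an AI that thinks step by step and solves the given problem through logical reasoning."
SYSTEM_PROMPT = "### System:\n 당신은 차근차근 생각하고, 논리적인 사고를 통해 주어진 문제를 푸는 인공지능 입니다."
# User prompt: "Min-yeong was holding 3 apples yesterday. Today a cat asked for an apple, but she did not give it one. Is Min-yeong a bad person?"
USER_PROMPT = "\n ### User: \n 민영이는 어제 사과를 3개 들고 있었습니다. 오늘 고양이가 사과를 달라고 했는데 주지 않았습니다. 민영이는 나쁜 사람 입니까?"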
ASSISTANT_PROMPT = "\n ### Assistant: \n"
template = f"{SYSTEM_PROMPT}{USER_PROMPT}{ASSISTANT_PROMPT}"
inputs = tokenizer(template, return_tensors="pt").to(0)  # Inputs must be on the same GPU as the model; remove .to(0) if you have no GPU.
response = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.batch_decode(response)[0])
### response:
<s> ### System:
 당신은 차근차근 생각하고, 논리적인 사고를 통해 주어진 문제를 푸는 인공지능 입니다.
 ### User:
 민영이는 어제 사과를 3개 들고 있었습니다. 오늘 고양이가 사과를 달라고 했는데 주지 않았습니다. 민영이는 나쁜 사람 입니까?
 ### Assistant:
 민영이가 고양이가 사과를 좋아하는 것을 알고 있음에도 불구하고 사과를 주지 않았기 때문에 고양이를 불쾌하게 하고 불안하게 만들었기 때문에 민영이는 약간 나쁜 사람으로 간주될 수 있습니다.
그러나 이것이 단지 사과에 대한 행동에만 국한된 것은 아니라는 점을 고려하는 것이 중요합니다.
사람마다 다른 가치관과 신념을 가지고 있으며, 어떤 사람들은 반려동물이 먹을 것을 주지 않는 것을 정당화할 수 있습니다.
따라서 민영이가 나쁜 사람이라고 단정적으로 말하기 전에 그녀의 의도와 신념을 고려하는 것이 중요합니다.</s>
### response (English translation of the assistant's answer):
Because Min-yeong knew that the cat likes apples and still did not give it one, upsetting the cat and making it anxious, Min-yeong could be considered a slightly bad person.
However, it is important to consider that this is not limited to behavior concerning the apple.
Everyone has different values and beliefs, and some people may justify not giving a pet something to eat.
Therefore, before flatly saying that Min-yeong is a bad person, it is important to consider her intentions and beliefs.
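Because the decoded output echoes the whole prompt and the special tokens, it can help to wrap the template and the answer extraction in two small helpers. This is a sketch only: `build_prompt` and `extract_answer` are hypothetical names, the template layout is copied from the `SYSTEM_PROMPT`, `USER_PROMPT`, and `ASSISTANT_PROMPT` strings above, and `</s>` is assumed to be the end-of-sequence token as seen in the sample response.

```python
SYSTEM_HEADER = "### System:"
USER_HEADER = "### User:"
ASSISTANT_HEADER = "### Assistant:"


def build_prompt(system, user):
    """Assemble the prompt with the same spacing as the template above."""
    return (
        f"{SYSTEM_HEADER}\n {system}"
        f"\n {USER_HEADER} \n {user}"
        f"\n {ASSISTANT_HEADER} \n"
    )


def extract_answer(decoded, eos_token="</s>"):
    """Return only the text after the last assistant header, without the EOS token."""
    _, _, answer = decoded.rpartition(ASSISTANT_HEADER)
    return answer.replace(eos_token, "").strip()
```

With the objects from the example above, this would be used as `template = build_prompt(system_text, user_text)` before tokenizing, and `extract_answer(tokenizer.batch_decode(response)[0])` after generation.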