Instructions to use v000000/NM-12B-Lyris-dev-2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use v000000/NM-12B-Lyris-dev-2 with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="v000000/NM-12B-Lyris-dev-2")# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("v000000/NM-12B-Lyris-dev-2") model = AutoModelForCausalLM.from_pretrained("v000000/NM-12B-Lyris-dev-2") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use v000000/NM-12B-Lyris-dev-2 with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "v000000/NM-12B-Lyris-dev-2" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "v000000/NM-12B-Lyris-dev-2", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/v000000/NM-12B-Lyris-dev-2
- SGLang
How to use v000000/NM-12B-Lyris-dev-2 with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "v000000/NM-12B-Lyris-dev-2" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "v000000/NM-12B-Lyris-dev-2", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "v000000/NM-12B-Lyris-dev-2" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "v000000/NM-12B-Lyris-dev-2", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use v000000/NM-12B-Lyris-dev-2 with Docker Model Runner:
docker model run hf.co/v000000/NM-12B-Lyris-dev-2
Lyris-dev2-Mistral-Nemo-12B-2407
attempt to fix Sao10k's Lyra-V3 prompt format and stop token >and boost smarts. with strategic LATCOS vector similarity merging
prototype, unfinished but works? Sometimes it does go on forever but it's way more useable, seems to have learnt to output stop token most of the time. But it's still pretty borked especially if greeting message is long. It needs even more Nemo-Instruct-2407 merged in.
- Sao10K/MN-12B-Lyra-v1 Base
- Sao10K/MN-12B-Lyra-v3 x2 Sequential PASS, order: 1, 3
- unsloth/Mistral-Nemo-Instruct-2407 x1 Single PASS, order: 2
- with z0.0001 value
Prompt format:
Mistral Instruct
[INST] System Message [/INST]
[INST] Name: Let's get started. Please respond based on the information and instructions provided above. [/INST]
<s>[INST] Name: What is your favourite condiment? [/INST]
AssistantName: Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>
[INST] Name: Do you have mayonnaise recipes? [/INST]
- Downloads last month
- 2
Model tree for v000000/NM-12B-Lyris-dev-2
Merge model
this model
