Instructions to use tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF",
    filename="Mistral-Nemo-Japanese-Instruct-2408-Q2_K.gguf",
)
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
# Run inference directly in the terminal:
llama-cli -hf tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
# Run inference directly in the terminal:
llama-cli -hf tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
# Run inference directly in the terminal:
./llama-cli -hf tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
# Run inference directly in the terminal:
./build/bin/llama-cli -hf tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
Use Docker
docker model run hf.co/tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
- LM Studio
- Jan
- vLLM
How to use tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
- Ollama
How to use tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF with Ollama:
ollama run hf.co/tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
- Unsloth Studio
How to use tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF to start chatting
- Docker Model Runner
How to use tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF with Docker Model Runner:
docker model run hf.co/tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
- Lemonade
How to use tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF:Q2_K
Run and chat with the model
lemonade run user.Mistral-Nemo-Japanese-Instruct-2408-GGUF-Q2_K
List all available models
lemonade list
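The vLLM curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the server listens on `http://localhost:8000/v1` as in the curl example, and the helper names `build_chat_request`, `extract_reply`, and `ask` are hypothetical, introduced here for illustration. Pass a different `base_url` if your server listens elsewhere.

```python
import json
import urllib.request

# Assumed defaults, taken from the curl example (vLLM on port 8000).
BASE_URL = "http://localhost:8000/v1"
MODEL = "tensorblock/Mistral-Nemo-Japanese-Instruct-2408-GGUF"


def build_chat_request(prompt: str, base_url: str = BASE_URL,
                       model: str = MODEL) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    return response["choices"][0]["message"]["content"]


def ask(prompt: str) -> str:
    """Send the prompt to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        return extract_reply(json.load(resp))
```

With a server running, `ask("What is the capital of France?")` returns the reply as a string. `extract_reply` should also work on the dict returned by `llm.create_chat_completion(...)` in llama-cpp-python, since that follows the same response layout.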
Remove .gguf files (keep Q2_K.gguf)
- Mistral-Nemo-Japanese-Instruct-2408-Q3_K_L.gguf +0 -3
- Mistral-Nemo-Japanese-Instruct-2408-Q3_K_M.gguf +0 -3
- Mistral-Nemo-Japanese-Instruct-2408-Q3_K_S.gguf +0 -3
- Mistral-Nemo-Japanese-Instruct-2408-Q4_0.gguf +0 -3
- Mistral-Nemo-Japanese-Instruct-2408-Q4_K_M.gguf +0 -3
- Mistral-Nemo-Japanese-Instruct-2408-Q4_K_S.gguf +0 -3
- Mistral-Nemo-Japanese-Instruct-2408-Q5_0.gguf +0 -3
- Mistral-Nemo-Japanese-Instruct-2408-Q5_K_M.gguf +0 -3
- Mistral-Nemo-Japanese-Instruct-2408-Q5_K_S.gguf +0 -3
- Mistral-Nemo-Japanese-Instruct-2408-Q6_K.gguf +0 -3
- Mistral-Nemo-Japanese-Instruct-2408-Q8_0.gguf +0 -3
Mistral-Nemo-Japanese-Instruct-2408-Q3_K_L.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:4f30929997f22bea2f61f04de75f9dd5b0d994912448441a8ac1d96c395c40e7
-size 6561515456
Mistral-Nemo-Japanese-Instruct-2408-Q3_K_M.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:7db67ff02c10b49085c427a730f8f338bbcd08188a40086024e92fa541c8738c
-size 6083102656
Mistral-Nemo-Japanese-Instruct-2408-Q3_K_S.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:3019d3eb6a0f10642a6a9dede675d4cd9e719a532d43f920dfae397acf22dc7a
-size 5534238656
Mistral-Nemo-Japanese-Instruct-2408-Q4_0.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:77aa73387175c6e83b1ed5a55d7ea7dbd70ef8f385cf2ea7b2ef15c5b40e80ba
-size 7071714560
Mistral-Nemo-Japanese-Instruct-2408-Q4_K_M.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:812cfb8d8af3797fbe40460afd8a723d72c7814c6165ff88598f4bed6e773305
-size 7477218560
Mistral-Nemo-Japanese-Instruct-2408-Q4_K_S.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:66c4ba39967830eb649d9284017b4cbec40ef8a8d87ca650923ffcb723ba9f68
-size 7120211200
Mistral-Nemo-Japanese-Instruct-2408-Q5_0.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:d352f291b756943294018d5111b42c5e4278e52e36c05599ea916ab394302959
-size 8518750720
Mistral-Nemo-Japanese-Instruct-2408-Q5_K_M.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:d1532de05956f0345019c65303134e9fd40cba967e582d1b1c6acae33b754e73
-size 8727646720
Mistral-Nemo-Japanese-Instruct-2408-Q5_K_S.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:fb33d8858d9de033ad2f81d9350cb7dce38bc1a9265947c8670351b2c543041b
-size 8518750720
Mistral-Nemo-Japanese-Instruct-2408-Q6_K.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:87ec043377bf840664b9547116b444d0082c184ea86e23504e5f863fc78ff50a
-size 10056226656
Mistral-Nemo-Japanese-Instruct-2408-Q8_0.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:5c8243e32e4e632f157d3b80d53bbba14286c64eae82a3ab49429b0ab6aec099
-size 13022390944
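Each deleted file in this commit is a three-line Git LFS pointer (`version`, `oid`, `size`), not the model weights themselves. As a rough illustration of how such a pointer can be read, here is a small sketch; `parse_lfs_pointer` is a hypothetical helper written for this note, and it assumes the pointer contains exactly the three keys shown in the diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer content of the deleted Q3_K_L file, copied from the diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:4f30929997f22bea2f61f04de75f9dd5b0d994912448441a8ac1d96c395c40e7
size 6561515456
"""

info = parse_lfs_pointer(pointer)
size_gib = info["size"] / 2**30  # bytes to GiB
print(f"{info['hash_algo']} {info['digest'][:12]}... {size_gib:.2f} GiB")
```

The `size` field is the byte count of the actual GGUF file, so it can be used to estimate how much storage each removed quantization occupied.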