Instructions for using LLaMAX/LLaMAX2-7B-MetaMath with libraries, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use LLaMAX/LLaMAX2-7B-MetaMath with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="LLaMAX/LLaMAX2-7B-MetaMath")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LLaMAX/LLaMAX2-7B-MetaMath")
model = AutoModelForCausalLM.from_pretrained("LLaMAX/LLaMAX2-7B-MetaMath")
```
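The pipeline object can then be called directly on a prompt. Below is a minimal generation sketch; the question wording and decoding settings are illustrative assumptions, not taken from the model card.

```python
# Minimal generation sketch: the prompt wording and decoding settings below
# are illustrative assumptions, not taken from the model card.
from transformers import pipeline

pipe = pipeline("text-generation", model="LLaMAX/LLaMAX2-7B-MetaMath")

prompt = "Question: A shelf holds 3 boxes with 12 apples each. How many apples are on the shelf?\nAnswer:"
output = pipe(prompt, max_new_tokens=256, do_sample=False)
print(output[0]["generated_text"])
```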
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use LLaMAX/LLaMAX2-7B-MetaMath with vLLM:
Install from pip and serve the model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "LLaMAX/LLaMAX2-7B-MetaMath"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "LLaMAX/LLaMAX2-7B-MetaMath",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker
```bash
docker model run hf.co/LLaMAX/LLaMAX2-7B-MetaMath
```
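Because the vLLM server exposes an OpenAI-compatible API, the curl call above can also be made from Python. Here is a minimal sketch using the official `openai` client (v1+); the `api_key` value is a placeholder, since a default local vLLM server does not validate it.

```python
# Query the local vLLM server through its OpenAI-compatible API.
# Assumes the server started above is listening on localhost:8000;
# the api_key is a placeholder and is not checked by a default local server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="LLaMAX/LLaMAX2-7B-MetaMath",
    prompt="Once upon a time,",
    max_tokens=512,
    temperature=0.5,
)
print(completion.choices[0].text)
```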
- SGLang
How to use LLaMAX/LLaMAX2-7B-MetaMath with SGLang:
Install from pip and serve the model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "LLaMAX/LLaMAX2-7B-MetaMath" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "LLaMAX/LLaMAX2-7B-MetaMath",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "LLaMAX/LLaMAX2-7B-MetaMath" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "LLaMAX/LLaMAX2-7B-MetaMath",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
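The SGLang server is likewise OpenAI-compatible, so the curl call above can be reproduced from Python. Below is a minimal sketch using the `requests` library against the server started on port 30000.

```python
# Send the same completion request as the curl example to the local SGLang
# server (OpenAI-compatible /v1/completions endpoint on port 30000).
import requests

response = requests.post(
    "http://localhost:30000/v1/completions",
    json={
        "model": "LLaMAX/LLaMAX2-7B-MetaMath",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5,
    },
)
print(response.json()["choices"][0]["text"])
```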
- Docker Model Runner
How to use LLaMAX/LLaMAX2-7B-MetaMath with Docker Model Runner:
```bash
docker model run hf.co/LLaMAX/LLaMAX2-7B-MetaMath
```