Instructions to use LLaMAX/GlotMAX-17-14B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use LLaMAX/GlotMAX-17-14B with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="LLaMAX/GlotMAX-17-14B")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("LLaMAX/GlotMAX-17-14B")
    model = AutoModelForCausalLM.from_pretrained("LLaMAX/GlotMAX-17-14B")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    # Decode only the newly generated tokens, skipping the prompt
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use LLaMAX/GlotMAX-17-14B with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "LLaMAX/GlotMAX-17-14B"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "LLaMAX/GlotMAX-17-14B",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
docker model run hf.co/LLaMAX/GlotMAX-17-14B
- SGLang
How to use LLaMAX/GlotMAX-17-14B with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "LLaMAX/GlotMAX-17-14B" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "LLaMAX/GlotMAX-17-14B",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "LLaMAX/GlotMAX-17-14B" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "LLaMAX/GlotMAX-17-14B",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Docker Model Runner
How to use LLaMAX/GlotMAX-17-14B with Docker Model Runner:
docker model run hf.co/LLaMAX/GlotMAX-17-14B
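The curl calls in the vLLM and SGLang sections can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes a server is already running locally on the port shown in those sections (8000 for vLLM, 30000 for SGLang) and that the response follows the usual OpenAI-compatible chat completions layout. The helper names `build_chat_request` and `chat` are illustrative and not part of any library.

```python
import json
import urllib.request

MODEL_ID = "LLaMAX/GlotMAX-17-14B"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (without sending) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8000", timeout=60):
    """Send the request and return the assistant's reply text.

    Assumes the OpenAI-compatible response shape:
    {"choices": [{"message": {"content": ...}}]}.
    """
    request = build_chat_request(prompt, base_url=base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
#   print(chat("What is the capital of France?"))                                  # vLLM
#   print(chat("What is the capital of France?", base_url="http://localhost:30000"))  # SGLang
```

Separating request construction from sending makes it possible to inspect the payload without a live server.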
Model Sources
- Paper: LLaMAX2: Your Translation-Enhanced Model Also Performs Well in Reasoning
- Link: https://arxiv.org/pdf/2510.09189
- Repository: https://github.com/CONE-MT/LLaMAX2.0
Model Description
GlotMAX series models start from Qwen3 instruct models and apply layer-selective tuning using only a small amount of parallel data.
Comprehensive testing on 16 reasoning tasks, including BBEH, LiveCodeBench, and OlymMATH, shows that the models surpass existing translation-enhanced models and perform on par with Qwen3 instruct models.
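As a rough illustration of what layer-selective tuning means in general, the sketch below decides which parameters would be trained given a set of chosen layer indices, leaving everything else frozen. It is not the authors' training code: which layers GlotMAX actually tunes is not stated here, so the indices are placeholders, and the `.layers.<index>.` naming pattern is an assumption based on how decoder layers are commonly named in Transformers checkpoints.

```python
import re

# Assumed naming pattern for decoder-layer parameters, e.g.
# "model.layers.3.self_attn.q_proj.weight".
_LAYER_PATTERN = re.compile(r"\.layers\.(\d+)\.")


def is_trainable(param_name, trainable_layers):
    """Return True if the parameter belongs to one of the selected layers."""
    match = _LAYER_PATTERN.search(param_name)
    return match is not None and int(match.group(1)) in trainable_layers


def split_parameters(param_names, trainable_layers):
    """Partition parameter names into (trainable, frozen) lists."""
    trainable = [n for n in param_names if is_trainable(n, trainable_layers)]
    frozen = [n for n in param_names if not is_trainable(n, trainable_layers)]
    return trainable, frozen


# Hypothetical parameter names and layer choice, for illustration only.
example_names = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.1.mlp.up_proj.weight",
    "model.layers.2.mlp.down_proj.weight",
    "lm_head.weight",
]
trainable, frozen = split_parameters(example_names, trainable_layers={0, 2})
print(trainable)
```

With a PyTorch model, the same predicate could be applied by iterating over `model.named_parameters()` and setting `param.requires_grad = is_trainable(name, trainable_layers)`.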
🔥 Excellent Translation Performance
Qwen3-XPlus significantly boosts translation performance in both high- and low-resource languages.
🔥 Excellent Reasoning Performance
Languages Covered by the Training Data
- en (English)
- ar (Arabic)
- bn (Bengali)
- cs (Czech)
- de (German)
- es (Spanish)
- fr (French)
- hu (Hungarian)
- ja (Japanese)
- ko (Korean)
- ru (Russian)
- sr (Serbian)
- sw (Swahili)
- te (Telugu)
- th (Thai)
- vi (Vietnamese)
- zh (Chinese)
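The language list above can be turned into a small lookup for building translation requests. This is a sketch only: the prompt wording is an assumption and not a template documented for this model, and `build_translation_messages` is a hypothetical helper. The returned messages have the same shape as the `messages` list in the Transformers example.

```python
# Language codes and names copied from the list above.
LANGUAGES = {
    "en": "English", "ar": "Arabic", "bn": "Bengali", "cs": "Czech",
    "de": "German", "es": "Spanish", "fr": "French", "hu": "Hungarian",
    "ja": "Japanese", "ko": "Korean", "ru": "Russian", "sr": "Serbian",
    "sw": "Swahili", "te": "Telugu", "th": "Thai", "vi": "Vietnamese",
    "zh": "Chinese",
}


def build_translation_messages(text, src, tgt):
    """Build a single-turn chat message asking for a translation.

    Raises ValueError if either code is not in the covered-language list.
    The prompt wording is illustrative, not an official template.
    """
    for code in (src, tgt):
        if code not in LANGUAGES:
            raise ValueError(f"Language code not in the training list: {code}")
    prompt = (
        f"Translate the following text from {LANGUAGES[src]} "
        f"to {LANGUAGES[tgt]}:\n{text}"
    )
    return [{"role": "user", "content": prompt}]


messages = build_translation_messages("Good morning.", src="en", tgt="sw")
print(messages[0]["content"])
```

The resulting `messages` can be passed to `pipe(messages)` or `tokenizer.apply_chat_template(...)` as in the Transformers snippet.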
Model Index
We provide multiple versions of the Qwen3-XPlus model. The model links are as follows:
Citation
If our model helps your work, please cite this paper:
@misc{gaoLLaMAX2YourTranslationEnhanced2025,
title = {{{LLaMAX2}}: {{Your Translation-Enhanced Model}} Also {{Performs Well}} in {{Reasoning}}},
shorttitle = {{{LLaMAX2}}},
author = {Gao, Changjiang and Huang, Zixian and Gong, Jingyang and Huang, Shujian and Li, Lei and Yuan, Fei},
year = {2025},
month = oct,
number = {arXiv:2510.09189},
eprint = {2510.09189},
primaryclass = {cs},
publisher = {arXiv},
doi = {10.48550/arXiv.2510.09189},
archiveprefix = {arXiv}
}

