Libraries

How to use Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation")
model = AutoModelForCausalLM.from_pretrained("Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation")

Local Apps

vLLM

How to use Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
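
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes a server started as above is listening on `http://localhost:8000`. The helper names `build_completion_request` and `complete` are hypothetical and introduced here for illustration only.

```python
import json
import urllib.request

MODEL_ID = "Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request mirroring the curl call (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first completion's text.

    Assumes the server is already running and returns an
    OpenAI-style completions response.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` should return the generated continuation. For the SGLang server, pass `base_url="http://localhost:30000"`.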

SGLang

How to use Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation with Docker Model Runner:
```
docker model run hf.co/Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation
```

Gemstone-256x27_lr_ablation

Gemstone-256x27_lr_ablation is part of the Gemstone Suite of Models, a set of models trained with varying widths and depths. This particular version, denoted by the _lr_ablation suffix, corresponds to an ablation detailed in the paper in which we train the same suite of models with a learning rate that is half of the original.

Training

We train with litgpt and AxoNN on AMD MI250X GPUs on Frontier at Oak Ridge National Laboratory, with a global batch size of 2048.

Data

Training and validation data are taken from non-overlapping subsets of dolma, so this is not an instruction-tuned model. The model is trained for 100 billion tokens (in contrast to the main suite, which is trained for 350 billion tokens), and we upload checkpoints every 2 billion tokens (477 steps).
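
As a sanity check on the checkpoint interval, the short calculation below relates steps to tokens. The global batch size (2048) and the 477-step interval come from this card; the sequence length of 2048 tokens is an assumption that is not stated here, chosen because it makes 477 steps come out at roughly 2 billion tokens.

```python
GLOBAL_BATCH_SIZE = 2048         # sequences per optimizer step (from the card)
CHECKPOINT_INTERVAL_STEPS = 477  # steps between uploaded checkpoints (from the card)
ASSUMED_SEQ_LEN = 2048           # ASSUMPTION: tokens per sequence, not stated in the card

# Tokens consumed in one optimizer step under the assumed sequence length.
tokens_per_step = GLOBAL_BATCH_SIZE * ASSUMED_SEQ_LEN

# Tokens consumed between two consecutive checkpoints.
tokens_per_checkpoint = tokens_per_step * CHECKPOINT_INTERVAL_STEPS

print(f"tokens per step:       {tokens_per_step:,}")
print(f"tokens per checkpoint: {tokens_per_checkpoint:,}")
```

Under this assumption each checkpoint interval covers about 2.0 billion tokens, which matches the figure quoted above.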

Using Gemstone-256x27_lr_ablation

The Gemstones are based on the gemma-2b architecture and use modeling_gemma.py to run using the transformers library.
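
A minimal sketch of generating text with the transformers library follows. Because this is a base model rather than an instruction model, the prompt is treated as text to be continued. The helper name `generate_continuation` is hypothetical. The `revision` argument is the standard `from_pretrained` parameter and defaults to `"main"`; the branch names of the intermediate checkpoints are not given in this card, so none are assumed here.

```python
MODEL_ID = "Gemstone-Models-LR-Ablation/Gemstone-256x27_lr_ablation"


def generate_continuation(prompt, max_new_tokens=50, revision="main"):
    """Greedily continue `prompt` with the model (hypothetical helper).

    Downloads the tokenizer and weights from the Hugging Face Hub on first use.
    """
    # Imported lazily so this module can be imported without torch/transformers.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, revision=revision)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        # do_sample=False selects greedy decoding for reproducible output.
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `generate_continuation("Once upon a time,")` returns the prompt followed by up to 50 newly generated tokens.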

Licence

This model is released under the apache-2.0 licence.

Contact

Please feel free to contact us with any questions or to open a discussion thread.

Citation

@article{mcleish2024gemstones,
    title={Gemstones: A Model Suite for Multi-Faceted Scaling Laws}, 
    author={Sean McLeish and John Kirchenbauer and David Yu Miller and Siddharth Singh and Abhinav Bhatele and Micah Goldblum and Ashwinee Panda and Tom Goldstein},
    journal={arXiv preprint arXiv:2502.},
    year={2025}
}

Safetensors model size: 52.3M params, tensor type F32
