How to use from the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="BeaverAI/Artemis-31B-v1a-GGUF",
	filename="Artemis-31B-v1a-Q8_0.gguf",
)
output = llm(
	"Once upon a time,",
	max_tokens=512,
	echo=True
)
print(output)


  • Gemma 4 31B
  • Accidentally trained on the base model; still works
  • Trained on 10% of the old dataset
  • Only Q8 is provided, to save on storage
  • Forgot to train the EOT token, so it will sometimes yap endlessly
  • (Note to self) Need to prepend the think placeholder next time
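
Since the EOT token was not trained, a completion may only end when it hits the token cap. A minimal sketch of one way to bound that, assuming the standard llama-cpp-python completion call (`max_tokens`, `stop`, and a `finish_reason` field in the result). The helper name `generate_bounded` is hypothetical, and the stop strings in the usage comment are placeholders, not this model's real template markers:

```python
from typing import Any, Callable, Sequence, Tuple


def generate_bounded(
    llm: Callable[..., Any],
    prompt: str,
    stop: Sequence[str],
    max_tokens: int = 256,
) -> Tuple[str, bool]:
    """Run one completion with a hard token cap and explicit stop strings.

    Returns (text, hit_cap). hit_cap is True when generation ended because
    max_tokens was reached, i.e. the model never stopped on its own.
    """
    # `llm` is expected to behave like a llama_cpp.Llama instance: callable
    # with max_tokens/stop and returning an OpenAI-style completion dict.
    output = llm(prompt, max_tokens=max_tokens, stop=list(stop))
    choice = output["choices"][0]
    return choice["text"], choice["finish_reason"] == "length"


# Usage with the `llm` object from the snippet at the top of this card.
# The stop strings below are ASSUMED placeholders; replace them with the
# turn markers from the actual chat template.
#
# text, hit_cap = generate_bounded(llm, "Once upon a time,", stop=["\n\n\n"])
# if hit_cap:
#     print("Output was cut off at the token cap.")
# print(text)
```

Checking `finish_reason` makes it easy to tell a natural stop from a run that was cut off, which is useful when deciding how tight the cap should be.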

Uses this template:

[Image: Screenshot 2026-04-09 at 6.31.08 AM]

Format: GGUF
Model size: 31B params
Architecture: gemma4
Quantization: 8-bit
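
For a sense of what the single Q8_0 file costs in storage, here is a rough back-of-the-envelope sketch. It assumes the usual GGML Q8_0 block layout (32 int8 weights plus one 16-bit scale, so 34 bytes per 32 weights), assumes every tensor is stored as Q8_0, and treats the rounded "31B" figure as exact. The real file will differ somewhat because of metadata and any tensors kept at other precisions. The function name is hypothetical:

```python
def q8_0_size_bytes(n_params: int) -> int:
    """Estimate the on-disk size of n_params weights stored as Q8_0."""
    block_weights = 32        # weights per Q8_0 block
    block_bytes = 2 + 32      # one fp16 scale + 32 int8 values
    n_blocks = -(-n_params // block_weights)  # ceiling division
    return n_blocks * block_bytes


# ASSUMPTION: exactly 31e9 parameters, all quantized to Q8_0.
size = q8_0_size_bytes(31_000_000_000)
print(f"{size / 1e9:.1f} GB")  # prints "32.9 GB"
```

That works out to about 8.5 bits per weight, which is why the file ends up slightly larger than one byte per parameter.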
