Swahili Gemma 1B - GGUF

Quantized GGUF versions of Swahili Gemma 1B, a fine-tuned Gemma 3 1B instruction model specialized for English-to-Swahili translation and Swahili conversational AI. The model accepts input in both English and Swahili but outputs responses exclusively in Swahili.

📊 Translation Performance

Model Comparison

| Model | Parameters | BLEU | chrF++ | Efficiency* |
|-------|------------|------|--------|-------------|
| Gemma 3 4B | 4B | 10.9 | 44.1 | 2.7 |
| Swahili Gemma 1B | 1B | 27.6 | 56.8 | 27.6 |
| Gemma 3 27B | 27B | 29.4 | 60.0 | 1.1 |
| GPT-5 Mini | ~8B | 31.8 | 62.4 | 4.0 |
| Gemini 2.0 Flash | Large | 35.6 | 64.6 | N/A |

*Efficiency = BLEU Score / Parameters (in billions)
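
For example, Swahili Gemma 1B scores 27.6 BLEU at 1B parameters (27.6 / 1 = 27.6), while Gemma 3 27B scores 29.4 BLEU at 27B parameters (29.4 / 27 ≈ 1.1).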

Key Performance Insights

  • 🎯 Efficiency Leader: Highest BLEU-to-parameter ratio of the models compared (27.6 BLEU per billion parameters)
  • 🚀 Size Advantage: Outperforms Gemma 3 4B (4x larger) by 153% on BLEU score
  • 💎 Competitive Quality: Reaches 94% of Gemma 3 27B's BLEU score with 27x fewer parameters
  • Practical Deployment: Runs efficiently on consumer hardware while maintaining quality

Evaluation Details

  • Dataset: FLORES-200 English→Swahili (1,012 translation pairs)
  • Metrics: BLEU (bilingual evaluation understudy) and chrF++ (character F-score)
  • Evaluation: Zero-shot translation performance
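
For reference, scores like these can be computed with the sacrebleu library. The sketch below is a minimal example; the file names are hypothetical, and the exact evaluation pipeline used for the table above is not published in this card.

```python
# pip install sacrebleu
import sacrebleu

# Hypothetical files: one sentence per line, model outputs aligned with references
with open("hypotheses.sw.txt") as f:
    hypotheses = [line.strip() for line in f]
with open("references.sw.txt") as f:
    references = [line.strip() for line in f]

# Corpus-level BLEU with a single reference set
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
# chrF++ is chrF with word n-grams enabled (word_order=2)
chrf = sacrebleu.corpus_chrf(hypotheses, [references], word_order=2)

print(f"BLEU:   {bleu.score:.1f}")
print(f"chrF++: {chrf.score:.1f}")
```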

🚀 Quick Start

# Install the Hugging Face Hub client first (shell):
#   pip install huggingface_hub

# Download the recommended Q4_K_M quantization
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="CraneAILabs/swahili-gemma-1b-GGUF",
    local_dir="swahili-gemma-1b-GGUF",
    allow_patterns=["Q4_K_M/*"],  # download only the Q4_K_M files
)
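
If you only need the single model file rather than the whole folder, hf_hub_download also works; the filename below mirrors the path used in the llama.cpp example later in this card (verify it against the repo's file listing):

```python
from huggingface_hub import hf_hub_download

# Fetch one GGUF file into the local HF cache and return its path
model_path = hf_hub_download(
    repo_id="CraneAILabs/swahili-gemma-1b-GGUF",
    filename="Q4_K_M/swahili-gemma-1b-q4_k_m.gguf",
)
print(model_path)
```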

📊 Available Quantizations

| Quantization | Folder | File Size | Quality | Use Case |
|--------------|--------|-----------|---------|----------|
| F32 | F32/ | ~3.8GB | Highest | Research & benchmarking |
| F16 | F16/ | ~1.9GB | Highest | Maximum-quality inference |
| Q8_0 | Q8_0/ | ~1.0GB | Very High | Production with ample resources |
| Q5_K_M | Q5_K_M/ | ~812MB | High | Balanced quality/size |
| Q4_K_M | Q4_K_M/ | ~769MB | Good | Recommended for most users |
| Q4_K_S | Q4_K_S/ | ~745MB | Good | Resource-constrained environments |
| Q3_K_M | Q3_K_M/ | ~689MB | Fair | Mobile/edge deployment |
| Q2_K | Q2_K/ | ~658MB | Lower | Minimal resource usage |

💻 Usage with llama.cpp

Basic Translation

# English to Swahili translation
./llama-cli \
  --model swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf \
  --prompt "Translate to Swahili: Hello, how are you today?" \
  --temp 0.3 \
  --top-p 0.95 \
  --top-k 64 \
  --repeat-penalty 1.1 \
  -n 128

🔧 Usage with Ollama

# Create model from GGUF
ollama create swahili-gemma-1b -f Modelfile

# Use for translation
ollama run swahili-gemma-1b "Translate to Swahili: Good morning!"

# Use for conversation  
ollama run swahili-gemma-1b "Hujambo! Je, unaweza kunisaidia?"

Modelfile Example

FROM swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf

# Gemma 3 marks turns with <start_of_turn>/<end_of_turn>; a system
# prompt is folded into the first user turn
TEMPLATE """<start_of_turn>user
{{ if .System }}{{ .System }}

{{ end }}{{ .Prompt }}<end_of_turn>
<start_of_turn>model
{{ .Response }}<end_of_turn>
"""

PARAMETER stop "<start_of_turn>"
PARAMETER stop "<end_of_turn>"

🐍 Usage with Python (llama-cpp-python)

from llama_cpp import Llama

# Initialize model
llm = Llama(
    model_path="swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf",
    n_ctx=2048,
    n_threads=8,
    verbose=False
)

# Generate translation
response = llm(
    "Translate to Swahili: Hello, how are you today?",
    max_tokens=128,
    temperature=0.3,
    top_p=0.95,
    top_k=64,
    repeat_penalty=1.1
)

print(response['choices'][0]['text'])
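
llama-cpp-python also exposes an OpenAI-style chat API that applies the chat template embedded in the GGUF file. Reusing the `llm` instance from above, a minimal conversational sketch:

```python
# Swahili conversation via the chat-completion interface
chat = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hujambo! Je, unaweza kunisaidia?"}],
    max_tokens=128,
    temperature=0.3,
)
print(chat["choices"][0]["message"]["content"])
```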

🌍 Language Capabilities

  • Input Languages: English + Swahili
  • Output Language: Swahili only
  • Primary Focus: English-to-Swahili translation and Swahili conversation

📊 Performance Metrics

Translation Quality (BLEU Scores)

| Model | BLEU Score | chrF++ |
|-------|------------|--------|
| 🥇 Swahili Gemma 1B | 23.64 | 52.26 |
| 🥈 ChatGPT-4o-latest | [TBD] | [TBD] |
| 🥉 Other Models | [TBD] | [TBD] |

Evaluated on 1,012 English-to-Swahili translation samples.

🎯 Capabilities

  • Translation: English-to-Swahili translation
  • Conversational AI: Natural dialogue in Swahili
  • Summarization: Text summarization in Swahili
  • Writing: Creative and informational writing in Swahili
  • Question Answering: General knowledge responses in Swahili

💡 Recommended Parameters

# Optimal settings for translation tasks
--temp 0.3
--top-p 0.95
--top-k 64
--repeat-penalty 1.1
--ctx-size 2048
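
In llama-cpp-python, the same settings split into a load-time context size and per-call sampling arguments, roughly as follows:

```python
from llama_cpp import Llama

# --ctx-size corresponds to the load-time n_ctx argument
llm = Llama(
    model_path="swahili-gemma-1b-GGUF/Q4_K_M/swahili-gemma-1b-q4_k_m.gguf",
    n_ctx=2048,
)

# The sampling flags map to per-call keyword arguments
out = llm(
    "Translate to Swahili: See you tomorrow!",
    max_tokens=128,
    temperature=0.3,
    top_p=0.95,
    top_k=64,
    repeat_penalty=1.1,
)
print(out["choices"][0]["text"])
```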

🔗 Related Models

  • Swahili Gemma 1B (original, unquantized model): https://huggingface.co/CraneAILabs/swahili-gemma-1b

🛠️ Technical Details

  • Base Model: google/gemma-3-1b-it
  • Architecture: Gemma 3
  • Context Length: 4,096 tokens
  • Quantization: GGUF format with multiple precision levels
  • Compatible: llama.cpp, Ollama, Jan, LM Studio, and other GGUF engines

🎨 Use Cases

  • Offline Translation: Run Swahili translation without internet
  • Local AI Assistant: Swahili conversational AI on your machine
  • Educational Tools: Language learning applications
  • Content Creation: Generate Swahili content locally
  • Research: Swahili language model experiments

⚠️ Limitations

  • Language Output: Responds only in Swahili
  • Quantization Trade-offs: Lower bit quantizations may reduce quality
  • Context Limit: 4K tokens for optimal performance
  • Specialized Tasks: May need fine-tuning for specific domains

📄 License

This model is released under the Gemma Terms of Use. Please review the terms before use.

🙏 Acknowledgments

  • Google: For the Gemma 3 base model, support, and guidance
  • Community: For Swahili language resources and datasets
  • Gilbert Korir (Msingi AI, Nairobi, Kenya)
  • Alfred Malengo Kondoro (Hanyang University, Seoul, South Korea)

Citation

If you use these GGUF quantizations in your research or applications, please cite:

@misc{crane_ai_labs_2025,
    author    = {Bakunga Bronson and Kato Steven Mubiru and Lwanga Caleb and Gimei Alex and Kavuma Lameck and Roland Ganafa and Sibomana Glorry and Atuhaire Collins and JohnRoy Nangeso and Tukamushaba Catherine},
    title     = {Swahili Gemma: A Fine-tuned Gemma 3 1B Model for Swahili conversational AI},
    year      = {2025},
    url       = {https://huggingface.co/CraneAILabs/swahili-gemma-1b},
    organization = {Crane AI Labs}
}

Built with ❤️ by Crane AI Labs

Swahili Gemma - Your helpful Swahili AI companion, optimized for local deployment
