Instructions to use leeroy-jankins/boogr with libraries, inference providers, notebooks, and local apps.
- Libraries
- sentence-transformers
How to use leeroy-jankins/boogr with sentence-transformers:

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("leeroy-jankins/boogr")

    sentences = [
        "The weather is lovely today.",
        "It's so sunny outside!",
        "He drove to the stadium.",
    ]
    embeddings = model.encode(sentences)

    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [3, 3]

- llama-cpp-python
How to use leeroy-jankins/boogr with llama-cpp-python:
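
    # !pip install llama-cpp-python
    from llama_cpp import Llama

    # Boogr is an embedding model, so load it in embedding mode
    # instead of prompting it for text completion.
    llm = Llama.from_pretrained(
        repo_id="leeroy-jankins/boogr",
        filename="boogr-small-en-v1.5-q8_0.gguf",
        embedding=True,
    )

    output = llm.create_embedding("The weather is lovely today.")
    # Length of the returned embedding vector
    print(len(output["data"][0]["embedding"]))
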
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use leeroy-jankins/boogr with llama.cpp:
Install from brew

    brew install llama.cpp

    # Start a local OpenAI-compatible server with a web UI:
    llama-server -hf leeroy-jankins/boogr:Q8_0

    # Run inference directly in the terminal:
    llama-cli -hf leeroy-jankins/boogr:Q8_0

Install from WinGet (Windows)

    winget install llama.cpp

    # Start a local OpenAI-compatible server with a web UI:
    llama-server -hf leeroy-jankins/boogr:Q8_0

    # Run inference directly in the terminal:
    llama-cli -hf leeroy-jankins/boogr:Q8_0

Use pre-built binary

    # Download pre-built binary from:
    # https://github.com/ggerganov/llama.cpp/releases

    # Start a local OpenAI-compatible server with a web UI:
    ./llama-server -hf leeroy-jankins/boogr:Q8_0

    # Run inference directly in the terminal:
    ./llama-cli -hf leeroy-jankins/boogr:Q8_0

Build from source code

    git clone https://github.com/ggerganov/llama.cpp.git
    cd llama.cpp
    cmake -B build
    cmake --build build -j --target llama-server llama-cli

    # Start a local OpenAI-compatible server with a web UI:
    ./build/bin/llama-server -hf leeroy-jankins/boogr:Q8_0

    # Run inference directly in the terminal:
    ./build/bin/llama-cli -hf leeroy-jankins/boogr:Q8_0

Use Docker
docker model run hf.co/leeroy-jankins/boogr:Q8_0
- LM Studio
- Jan
- Ollama
How to use leeroy-jankins/boogr with Ollama:
ollama run hf.co/leeroy-jankins/boogr:Q8_0
- Unsloth Studio
How to use leeroy-jankins/boogr with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)

    curl -fsSL https://unsloth.ai/install.sh | sh

    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888

    # Then open http://localhost:8888 in your browser
    # Search for leeroy-jankins/boogr to start chatting

Install Unsloth Studio (Windows)

    irm https://unsloth.ai/install.ps1 | iex

    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888

    # Then open http://localhost:8888 in your browser
    # Search for leeroy-jankins/boogr to start chatting

Using HuggingFace Spaces for Unsloth

    # No setup required
    # Open https://huggingface.co/spaces/unsloth/studio in your browser
    # Search for leeroy-jankins/boogr to start chatting

- Docker Model Runner
How to use leeroy-jankins/boogr with Docker Model Runner:
docker model run hf.co/leeroy-jankins/boogr:Q8_0
- Lemonade
How to use leeroy-jankins/boogr with Lemonade:
Pull the model

    # Download Lemonade from https://lemonade-server.ai/
    lemonade pull leeroy-jankins/boogr:Q8_0

Run and chat with the model
lemonade run user.boogr-Q8_0
List all available models
lemonade list
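
The llama.cpp instructions above describe `llama-server` as OpenAI-compatible. As a minimal sketch of what a client for such a server might look like, the snippet below builds and parses an OpenAI-style embeddings request using only the Python standard library. The base URL `http://localhost:8080`, the `/v1/embeddings` path, the `model` value, and the helper names are assumptions for illustration; the server may also need to be started with embeddings enabled, so check the server's own documentation before relying on this.

```python
import json
import urllib.request

def build_embedding_request(texts, base_url="http://localhost:8080", model="boogr"):
    """Build a POST request for an OpenAI-style /v1/embeddings endpoint (assumed path)."""
    payload = json.dumps({"input": texts, "model": model}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response, ordered by their index field."""
    items = sorted(json.loads(response_body)["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

def embed(texts, **kwargs):
    """Send the request to a running server and return one vector per input text."""
    with urllib.request.urlopen(build_embedding_request(texts, **kwargs)) as resp:
        return parse_embeddings(resp.read())

# Building the request does not contact the server; embed() does.
request = build_embedding_request(["The weather is lovely today."])
print(request.full_url)
```

Keeping request construction and response parsing separate from the network call makes both easy to check without a running server.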
✨ Overview
Boogr is derived from BAAI's bge-small-en-v1.5, part of the BGE
(BAAI General Embedding) family.
The upstream model family is designed for dense retrieval and text embedding tasks such as:
- semantic search
- document retrieval
- chunk similarity
- passage ranking
- clustering
- sentence-level representation learning
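
As a rough illustration of how the retrieval-style tasks above use embeddings, the sketch below ranks passages by cosine similarity to a query. The three-dimensional vectors are made-up stand-ins for real model output, and `cosine` and `rank` are hypothetical helper names, not part of any library.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, passage_vecs):
    """Return (passage_id, score) pairs sorted from most to least similar."""
    scored = [(pid, cosine(query_vec, vec)) for pid, vec in passage_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy vectors (assumption): real embeddings would come from the model.
query = [1.0, 0.0, 1.0]
passages = {
    "weather": [0.9, 0.1, 0.8],
    "stadium": [0.0, 1.0, 0.1],
    "sunny": [0.7, 0.0, 0.9],
}

ranking = rank(query, passages)
print([pid for pid, _ in ranking])
```

The same ranking step applies to semantic search, passage ranking, and chunk similarity; only the source of the vectors changes.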
Within Chonky, Boogr is the lightweight local English embedding option and is best suited for:
- default local installations
- offline embedding workflows
- rapid experimentation
- development and testing
- vectorizing chunked corpora on lower-resource systems
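
Vectorizing a chunked corpus generally means splitting each document into pieces that fit the embedding model's input limit before encoding. The sketch below shows one simple way to do that with overlapping word windows. `chunk_words`, `max_words`, and `overlap` are hypothetical names, word counts only approximate token counts, and this is not necessarily how Chonky chunks text.

```python
def chunk_words(text, max_words=200, overlap=20):
    """Split text into overlapping windows of at most max_words words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks

# Synthetic 450-word document, used only to show the window sizes.
document = " ".join(f"w{i}" for i in range(450))
chunks = chunk_words(document, max_words=200, overlap=20)
print(len(chunks), [len(c.split()) for c in chunks])
```

The overlap keeps sentences that straddle a boundary visible in both neighboring chunks, at the cost of embedding some words twice.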
⚙️ Code Repository
🧰 Streamlit UI
🧠 Why Boogr Exists
Chonky supports both hosted and local embedding workflows. Boogr exists to give Chonky users a fully local, low-friction embedding path that avoids dependence on hosted provider APIs for common semantic-search tasks.
Boogr is especially useful when you want:
- local-only embeddings
- offline or restricted-network operation
- lower memory use than larger embedding models
- an English-first default embedder
- a model that is straightforward to distribute with the application
🔬 Base Model Lineage
Boogr is derived from:
- Upstream base model: BAAI/bge-small-en-v1.5
- Model family: BGE / FlagEmbedding
- Primary task family: feature extraction / text embeddings
- Language focus: English
- License: MIT
The v1.5 revision of the BGE family was introduced to improve retrieval behavior and
address similarity-distribution issues observed in earlier releases.
Specs
- Model
| Model Name | Dimension | Sequence Length | Introduction |
|---|---|---|---|
| boogr | 384 | 512 | English model; derived from BAAI/bge-small-en-v1.5 (8-bit Q8_0 GGUF) |
| BAAI/bge-m3-unsupervised | 1024 | 8192 | multilingual; contrastive learning from bge-m3-retromae |
| BAAI/bge-m3-retromae | -- | 8192 | multilingual; extend the max_length of xlm-roberta to 8192 and further pretrained via retromae |
| BAAI/bge-large-en-v1.5 | 1024 | 512 | English model |
| BAAI/bge-base-en-v1.5 | 768 | 512 | English model |
| BAAI/bge-small-en-v1.5 | 384 | 512 | English model |
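
The dimension column largely determines how much space a vector index takes. As a back-of-the-envelope sketch, assuming vectors are stored as uncompressed 32-bit floats with no index overhead (an assumption, since storage formats vary), the raw size is vectors × dimensions × 4 bytes. `index_size_mib` is a hypothetical helper, and the 100,000-chunk corpus size is an arbitrary example.

```python
def index_size_mib(num_vectors, dimension, bytes_per_value=4):
    """Raw storage for a dense float index, in MiB (no index overhead)."""
    return num_vectors * dimension * bytes_per_value / (1024 ** 2)

# Dimensions taken from the English v1.5 rows of the table above.
models = [
    ("bge-small-en-v1.5", 384),
    ("bge-base-en-v1.5", 768),
    ("bge-large-en-v1.5", 1024),
]
for name, dim in models:
    print(f"{name}: {index_size_mib(100_000, dim):.1f} MiB")
```

Under these assumptions, the 384-dimension model needs well under half the storage of the 1024-dimension one for the same corpus, which is part of why a small model suits lower-resource systems.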
- Data
| Dataset | Introduction |
|---|---|
| MLDR | Document Retrieval Dataset, covering 13 languages |
| bge-m3-data | Fine-tuning data used by bge-m3 |
FAQ
1. Introduction for different retrieval methods
- Dense retrieval: map the text into a single embedding, e.g., DPR, BGE-v1.5
- Sparse retrieval (lexical matching): represent the text as a vector whose size equals the vocabulary, with most positions set to zero and a weight computed only for tokens present in the text, e.g., BM25, uniCOIL, and SPLADE.
- Multi-vector retrieval: use multiple vectors to represent a text, e.g., ColBERT.
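
To make the difference between the three methods concrete, the sketch below scores a query against a document in each style. The inputs are toy values rather than real model output, and the function names are hypothetical. The multi-vector function uses the sum-of-maxima (MaxSim) scoring associated with ColBERT; other systems may normalize it differently.

```python
def dense_score(q_vec, d_vec):
    """Dense retrieval: one vector per text, scored by inner product."""
    return sum(q * d for q, d in zip(q_vec, d_vec))

def sparse_score(q_weights, d_weights):
    """Sparse retrieval: token -> weight maps, scored over shared tokens only."""
    return sum(w * d_weights[tok] for tok, w in q_weights.items() if tok in d_weights)

def multi_vector_score(q_vecs, d_vecs):
    """Multi-vector retrieval (ColBERT-style MaxSim): for each query vector,
    take its best match among the document vectors, then sum."""
    return sum(max(dense_score(q, d) for d in d_vecs) for q in q_vecs)

# Toy inputs (assumptions), not real model output.
dense = dense_score([0.6, 0.8], [0.8, 0.6])
sparse = sparse_score({"solar": 1.2, "panel": 0.5}, {"solar": 0.9, "roof": 0.4})
multi = multi_vector_score([[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.9, 0.1]])
print(dense, sparse, multi)
```

Only the token shared by both texts contributes to the sparse score, while the multi-vector score lets each query vector pick a different best-matching document vector.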
2. How to use boogr in other projects?
For embedding retrieval, you can use the same approach as with other BGE models. BGE-M3 no longer requires adding instructions to queries; for v1.5 models such as Boogr's base model, the upstream BGE documentation treats the query instruction as optional.
For hybrid retrieval, you can use Vespa and Milvus.
3. How to fine-tune boogr?
You can follow the common procedure in this example to fine-tune the dense embedding.
If you want to fine-tune all embedding functions of M3 (dense, sparse, and ColBERT), you can refer to the unified_fine-tuning example.
Model tree for leeroy-jankins/boogr
Base model
BAAI/bge-small-en-v1.5