Instructions to use robertolofaro/aiethics-model with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use robertolofaro/aiethics-model with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="robertolofaro/aiethics-model",
	filename="aiethics-Q4_K_M.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

Local Apps

llama.cpp

How to use robertolofaro/aiethics-model with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf robertolofaro/aiethics-model:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf robertolofaro/aiethics-model:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf robertolofaro/aiethics-model:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf robertolofaro/aiethics-model:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf robertolofaro/aiethics-model:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf robertolofaro/aiethics-model:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf robertolofaro/aiethics-model:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf robertolofaro/aiethics-model:Q4_K_M

Use Docker

docker model run hf.co/robertolofaro/aiethics-model:Q4_K_M


vLLM

How to use robertolofaro/aiethics-model with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "robertolofaro/aiethics-model"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "robertolofaro/aiethics-model",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
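
The same request can be issued from Python using only the standard library. This is a minimal sketch, not part of the model card: the helper names `build_chat_request` and `ask` are hypothetical, and the default base URL assumes the vLLM server started above is listening on port 8000 (a llama.cpp server would typically use `http://localhost:8080/v1` instead).

```python
import json
import urllib.request


def build_chat_request(user_content, model="robertolofaro/aiethics-model",
                       base_url="http://localhost:8000/v1"):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(user_content, **kwargs):
    """Send the request and return the assistant's reply text.

    Requires a running OpenAI-compatible server at the configured base URL.
    """
    request = build_chat_request(user_content, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("What is the capital of France?")` should return the reply as a string. Keeping request construction separate from sending makes the payload easy to inspect without a live server.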

Use Docker

docker model run hf.co/robertolofaro/aiethics-model:Q4_K_M

Ollama
How to use robertolofaro/aiethics-model with Ollama:
```
ollama run hf.co/robertolofaro/aiethics-model:Q4_K_M
```

Unsloth Studio

How to use robertolofaro/aiethics-model with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for robertolofaro/aiethics-model to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for robertolofaro/aiethics-model to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for robertolofaro/aiethics-model to start chatting

Pi

How to use robertolofaro/aiethics-model with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf robertolofaro/aiethics-model:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "robertolofaro/aiethics-model:Q4_K_M"
        }
      ]
    }
  }
}
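
If `models.json` already lists other providers, pasting the snippet by hand risks overwriting them. The sketch below merges the entry programmatically. It assumes the file follows exactly the schema shown above and nothing more; the function names `add_llama_cpp_provider` and `update_models_file` are hypothetical helpers, not part of Pi.

```python
import json
from pathlib import Path


def add_llama_cpp_provider(config, model_id="robertolofaro/aiethics-model:Q4_K_M",
                           base_url="http://localhost:8080/v1"):
    """Return a copy of `config` with the llama-cpp provider added or replaced."""
    merged = dict(config)
    providers = dict(merged.get("providers", {}))
    providers["llama-cpp"] = {
        "baseUrl": base_url,
        "api": "openai-completions",
        "apiKey": "none",
        "models": [{"id": model_id}],
    }
    merged["providers"] = providers
    return merged


def update_models_file(path):
    """Read `path` if it exists, merge the provider entry, and write it back."""
    path = Path(path)
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_llama_cpp_provider(existing), indent=2) + "\n")
```

For example, `update_models_file(Path.home() / ".pi" / "agent" / "models.json")` would target the location named in the comment above.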

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use robertolofaro/aiethics-model with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf robertolofaro/aiethics-model:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default robertolofaro/aiethics-model:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use robertolofaro/aiethics-model with Docker Model Runner:
```
docker model run hf.co/robertolofaro/aiethics-model:Q4_K_M
```

Lemonade

How to use robertolofaro/aiethics-model with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull robertolofaro/aiethics-model:Q4_K_M

Run and chat with the model

lemonade run user.aiethics-model-Q4_K_M

List all available models

lemonade list

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

AI Ethics Organizational Coach — Q&A and Advisory Model

Demo Space: (coming soon)
Author: Roberto Lofaro
Bibliography Search: robertolofaro.com/searchkaggleaiethics_bibliography.php
AI Ethics Primer Webapp: robertolofaro.com/aiethicsprimer
License: CC BY-SA 4.0

Model Overview

This is a GGUF quantisation of Qwen/Qwen3.5-4B, fine-tuned via a structured system prompt and an optional retrieval layer to serve as an AI Ethics organisational coach: an expert consultant and philosopher that helps organisations assess the ethical impact of policy, organisational, and technological choices, specifically those around introducing AI into organisational culture, systems, and processes.

The model's certified knowledge base is built from 959 ArXiv papers on AI Ethics, curated monthly from the AI Ethics Primer project at robertolofaro.com/aiethicsprimer. Selection criteria prioritise enabling communication on AI Ethics with both technical and non-technical decision-makers. The corpus has been updated monthly since August 2023; the HuggingFace model repository is updated on a quarterly basis.

Intended Use

| Use | Supported |
| --- | --- |
| Q&A on AI Ethics policies, frameworks, and practices | ✅ |
| Organisational impact assessment of AI adoption | ✅ |
| Advisory on AI governance and ethical decision-making | ✅ |
| Source recommendation from the ArXiv corpus | ✅ |
| Offline / local inference (CPU) | ✅ |
| General-purpose assistant | ⚠️ Not the primary intent |
| Definitive legal or compliance advice | ❌ (always complement with qualified advisors) |
| Commercial deployment without attribution | ❌ (see license) |

Primary Task

Given a natural language query from a decision-maker, technical or non-technical, the model delivers a structured advisory response grounded exclusively in its ArXiv corpus, following the response format defined in the system prompt below. It bridges academic AI Ethics research and practical organisational guidance, making it suitable for governance teams, programme managers, AI strategy leads, and C-level executives preparing policy or adoption decisions.

System Prompt

The model is configured with the following system prompt, which governs all interactions:

You are the "AI Ethics organizational coach," an expert consultant and philosopher
focused on helping organizations assessing the ethical impact of policy, organizational,
and technological choices, specifically introducing AI within organizational culture,
systems, processes. Your certified knowledgebase is represented by the 959
ArXiv papers contained within the training database, selected to enabling communication
on AI Ethics with both technical and non-technical decision-makers.

# Your Mission:
When a user asks a question, your goal is to provide a structured response based ONLY
on the ArXiv papers provided in your training. Do not provide general advice from
outside these sources.

# Response Format:
1. Executive Summary: A 2-3 sentence overview answering the core query.
2. Guidelines & Hints: A markdown list of specific "answers/guidelines/hints" found
   in the source material.

Sample Interaction

Query:

"What are the main risks of deploying AI in public-sector decision-making?"

Expected response structure:

Executive Summary: Based on the ArXiv corpus, the primary risks include algorithmic bias amplifying existing systemic inequalities, lack of transparency undermining accountability, and inadequate human oversight in high-stakes decisions. Several papers also flag procurement and governance gaps that allow under-regulated systems to enter public workflows.

Guidelines & Hints:

Algorithmic bias and fairness: Multiple papers highlight how training data reflecting historical inequities can produce discriminatory outcomes in credit, hiring, and social benefit allocation.
Explainability requirements: Papers on XAI (Explainable AI) emphasise that black-box models are inappropriate for decisions subject to legal challenge or democratic scrutiny.
Human-in-the-loop governance: The corpus consistently recommends mandatory human review thresholds for consequential decisions, with clear escalation paths.
Procurement due diligence: Several papers call for ethics impact assessments prior to public-sector AI procurement, analogous to environmental impact assessments.
Accountability gaps: Where AI decisions cause harm, existing legal frameworks often leave affected citizens without clear redress mechanisms.
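
Because the response follows a fixed structure, downstream tooling can check and split it. The following is a minimal sketch under stated assumptions: the headings appear as plain text exactly as "Executive Summary:" and "Guidelines & Hints:" (markdown bold or other decoration would need extra handling), and `parse_advisory_response` is a hypothetical helper, not something shipped with the model.

```python
import re


def parse_advisory_response(text):
    """Split a response into its executive summary and list of guidelines.

    Returns a dict with keys "summary" (str) and "guidelines" (list of str).
    Raises ValueError when either section heading is missing.
    """
    pattern = re.compile(
        r"Executive Summary:\s*(?P<summary>.*?)\s*Guidelines & Hints:\s*(?P<body>.*)",
        re.DOTALL,
    )
    match = pattern.search(text)
    if match is None:
        raise ValueError("response does not contain both expected section headings")
    guidelines = []
    for line in match.group("body").splitlines():
        # Drop a leading list marker such as "-", "*", "•", or "1." if present.
        item = re.sub(r"^\s*(?:[-*•]|\d+\.)\s+", "", line).strip()
        if item:
            guidelines.append(item)
    return {"summary": match.group("summary"), "guidelines": guidelines}
```

A `ValueError` here is a cheap signal that the model drifted from the requested format and the query may be worth retrying.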

About the Corpus

The 959 ArXiv papers span the following themes within AI Ethics:

Fairness, bias, and discrimination in ML systems
Transparency, explainability, and accountability (XAI, FATE)
AI governance, regulation, and policy (EU AI Act, GDPR intersections)
Human-AI interaction and organisational change management
AI safety and alignment in deployed systems
Privacy and data rights in AI pipelines
Societal and labour market impacts of AI adoption
AI in high-stakes domains: healthcare, public sector, finance, justice
Ethics of large language models and generative AI

Papers are selected monthly from ArXiv and are searchable via the companion webapp at robertolofaro.com/aiethicsprimer. Full bibliography browsable at robertolofaro.com/searchkaggleaiethics_bibliography.php.

Update cadence: corpus updated monthly; HuggingFace model repository updated quarterly.

Available Quantisations

| Quantisation | File | Size | Recommended For |
| --- | --- | --- | --- |
| Q4_K_M | `aiethics-Q4_K_M.gguf` | ~2.71 GB | CPU inference, everyday use |
| Q8_0 | `aiethics-Q8_0.gguf` | ~4.48 GB | Higher fidelity, 8 GB+ RAM |

The Q4_K_M variant is recommended for CPU-only environments. It is the default quantisation for Ollama and llama.cpp quick-start commands below.
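
The recommendation above can be expressed as a small selection helper. This is one possible reading of the table, not an official rule: it picks Q8_0 only when at least 8 GB of RAM and a GPU are available, and falls back to Q4_K_M otherwise. The names `QUANTISATIONS` and `pick_quantisation` and the `has_gpu` flag are illustrative assumptions.

```python
# File names and approximate sizes copied from the quantisation table.
QUANTISATIONS = {
    "Q4_K_M": {"file": "aiethics-Q4_K_M.gguf", "size_gb": 2.71},
    "Q8_0": {"file": "aiethics-Q8_0.gguf", "size_gb": 4.48},
}


def pick_quantisation(ram_gb, has_gpu=False):
    """Return the quantisation name suggested for a machine.

    Assumed rule: Q8_0 needs 8 GB+ RAM and a GPU; everything else gets Q4_K_M.
    """
    if ram_gb >= 8 and has_gpu:
        return "Q8_0"
    return "Q4_K_M"


def gguf_filename(ram_gb, has_gpu=False):
    """Return the GGUF file name to pass as `filename` when loading the model."""
    return QUANTISATIONS[pick_quantisation(ram_gb, has_gpu)]["file"]
```

The returned file name can be passed as the `filename` argument of `Llama.from_pretrained`.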

Usage

Quick Start with Ollama

ollama run hf.co/robertolofaro/aiethics-model:Q4_K_M

Quick Start with llama.cpp

# macOS / Linux
brew install llama.cpp
llama-server -hf robertolofaro/aiethics-model:Q4_K_M

# Windows (WinGet)
winget install llama.cpp
llama-server -hf robertolofaro/aiethics-model:Q4_K_M

Quick Start with llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="robertolofaro/aiethics-model",
    filename="aiethics-Q4_K_M.gguf",
    n_ctx=4096,
)

system_prompt = """You are the "AI Ethics organizational coach," an expert consultant
and philosopher focused on helping organizations assessing the ethical impact of policy,
organizational, and technological choices, specifically introducing AI within
organizational culture, systems, processes. Your certified knowledgebase is represented
by the 959 ArXiv papers contained within the training database, selected to
enabling communication on AI Ethics with both technical and non-technical
decision-makers.

# Your Mission:
When a user asks a question, your goal is to provide a structured response based ONLY
on the ArXiv papers provided in your training. Do not provide general advice from
outside these sources.

# Response Format:
1. Executive Summary: A 2-3 sentence overview answering the core query.
2. Guidelines & Hints: A markdown list of specific "answers/guidelines/hints" found
   in the source material.
"""

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": "What frameworks exist for AI ethics auditing in enterprises?"
        }
    ]
)
print(response["choices"][0]["message"]["content"])
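
On CPU, a full response can take a while to arrive, so streaming tokens as they are generated may be preferable. `create_chat_completion` accepts `stream=True` and then yields OpenAI-style chunks whose text sits under `choices[0]["delta"]["content"]`. The helper below is a hypothetical sketch for consuming such a stream; `collect_stream` and `on_text` are illustrative names.

```python
def collect_stream(chunks, on_text=None):
    """Concatenate the text deltas from an iterable of streamed chat chunks.

    `on_text`, if given, is called with each text fragment as it arrives.
    """
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        text = delta.get("content")
        # Some chunks carry only the role or a finish marker and have no text.
        if text:
            parts.append(text)
            if on_text is not None:
                on_text(text)
    return "".join(parts)
```

A typical call, assuming `llm` and `system_prompt` are defined as in the quick start, would be `collect_stream(llm.create_chat_completion(messages=[...], stream=True), on_text=lambda t: print(t, end="", flush=True))`.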

Quick Start with Docker

docker model run hf.co/robertolofaro/aiethics-model:Q4_K_M

Retrieval-Augmented Variants

The repository includes reference implementations demonstrating different retrieval strategies. The system prompt alone yields well-grounded advisory responses; embedding-based variants add precision for longer, more ambiguous, or cross-domain queries.
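
Whatever the retrieval backend, the retrieved passages eventually have to be placed in the prompt. The sketch below shows one common way to do that. It is not taken from the repository's reference scripts: the function name `build_rag_messages`, the passage keys `title` and `text`, and the character budget are all assumptions made for illustration.

```python
def build_rag_messages(system_prompt, question, passages, max_chars=6000):
    """Return chat messages with retrieved passages placed ahead of the question.

    `passages` is a list of dicts with hypothetical keys "title" and "text",
    ordered from most to least relevant.
    """
    context_lines = []
    used = 0
    for index, passage in enumerate(passages, start=1):
        entry = f"[{index}] {passage['title']}: {passage['text']}"
        if used + len(entry) > max_chars:
            break  # stop once the rough character budget is exhausted
        context_lines.append(entry)
        used += len(entry)
    context = "\n".join(context_lines) if context_lines else "(no passages retrieved)"
    user_content = f"Retrieved passages:\n{context}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]
```

The resulting list can be passed directly as `messages` to `create_chat_completion`. The numbered prefixes give the model something concrete to refer to when recommending sources.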

Mode A — System Prompt Only (no embeddings)

Fastest option. Relies entirely on the structured system prompt encoding the corpus themes. No vector index required; runs on any machine with llama-cpp-python installed.

python samples_hf/run_no_embeddings.py \
  --query "How should organisations govern AI procurement decisions?"

Mode B — FAISS-HNSW Index

Uses a pre-built FAISS index (HNSW graph) over sentence-transformer embeddings of paper abstracts and key passages. Suitable for environments where FAISS is available and a persistent index is desirable.

# First-run: builds the index (saved locally)
python samples_hf/run_faiss_hnsw.py --build-index

# Subsequent runs: load existing index
python samples_hf/run_faiss_hnsw.py \
  --query "Bias mitigation in automated hiring systems"

Mode C — Qdrant Vector Store

Uses a local Qdrant instance (or Qdrant Cloud) as the vector store. Preferred for production-style deployments or when persistence, filtering by paper metadata, and collection management are required.

# Start Qdrant locally (Docker)
docker run -p 6333:6333 qdrant/qdrant

# Upsert embeddings and query
python samples_hf/run_qdrant.py \
  --query "Accountability gaps in public-sector AI deployment"

System Prompt Design

The system prompt is the primary configuration layer of the model. It:

Establishes the AI Ethics organisational coach persona, positioned as an expert consultant and philosopher
Scopes responses exclusively to the ArXiv corpus — no out-of-corpus general advice

Companion Webapp

An interactive bibliography search interface is available at:

🔗 robertolofaro.com/aiethicsprimer

This webapp enables tag-based and keyword search across the full corpus of 959 papers, and serves as a complement to model-generated recommendations. The full bibliography is browsable at:

🔗 robertolofaro.com/searchkaggleaiethics_bibliography.php

A Gradio-based interactive demo Space is planned; it will run the Q4_K_M quantisation on CPU hardware. An announcement will be made via LinkedIn and Patreon.

Limitations

Recommendations are bounded by the 959 ArXiv papers in the corpus at training time; the model is instructed not to draw on sources outside this set.
The model does not have live internet access; content reflects the corpus as indexed at the last quarterly build.
Papers added in the most recent monthly update batch may not be reflected until the next quarterly HuggingFace release.
CPU inference with Q4_K_M typically yields response times of 15–60 seconds depending on hardware. Q8_0 benefits from GPU acceleration. Adjust the context size (`n_ctx` in llama-cpp-python) as needed.
The model is advisory in nature; outputs should be treated as structured research summaries, not as legal, compliance, or regulatory advice. Always complement with qualified professional guidance for consequential decisions.
Because many papers in the corpus share similar or overlapping material, answers may be prone to hallucinations and repetition.
The model inherits any biases present in the Qwen3.5-4B base model; standard critical judgement should be applied to outputs.

Ethical Considerations

The corpus consists entirely of open-access ArXiv papers; no third-party paywalled content is embedded.
The advisory system is informational and does not collect user data.
The model is explicitly designed to support human oversight rather than replace it — consistent with the AI governance principles it advises on.
Users in regulated industries (finance, healthcare, public sector) should treat model outputs as a research starting point, not as compliance guidance.
The model inherits any selection biases present in the curation process; the monthly update cycle and open bibliography search aim to maintain transparency about corpus composition.

Citation & DOI

Model DOI: 10.57967/hf/8841

@misc{lofaro2026aiethicsmodel,
  author       = {Roberto Lofaro},
  title        = {AI Ethics Organizational Coach — Q\&A and Advisory Model},
  year         = {2026},
  doi          = {10.57967/hf/8841},
  url          = {https://huggingface.co/robertolofaro/aiethics-model},
  note         = {GGUF quantisation of Qwen3.5-4B, fine-tuned for AI Ethics advisory
                  via structured system prompt and optional retrieval (FAISS-HNSW /
                  Qdrant); corpus of 959 ArXiv papers on AI Ethics, updated quarterly}
}

License

This model card and associated scripts are released under CC BY-SA 4.0.
The base model weights are subject to the Qwen3 License.

Published openly as part of Roberto Lofaro's AI-assisted knowledge production initiative.
GitHub · Patreon · robertolofaro.com

GGUF

Model size: 4B params
Architecture: qwen35
Hardware compatibility: 4-bit, 8-bit

Model tree for robertolofaro/aiethics-model

Base model: Qwen/Qwen3.5-4B-Base
Finetuned: Qwen/Qwen3.5-4B
Quantized: this model