Instructions to use RayyanAhmed9477/Health-Chatbot with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use RayyanAhmed9477/Health-Chatbot with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="RayyanAhmed9477/Health-Chatbot")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

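# Load model directly
from transformers import AutoModelForCausalLM, AutoTokenizer

# AutoModelForCausalLM keeps the language-modeling head that text generation needs
tokenizer = AutoTokenizer.from_pretrained("RayyanAhmed9477/Health-Chatbot")
model = AutoModelForCausalLM.from_pretrained("RayyanAhmed9477/Health-Chatbot", dtype="auto")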

Local Apps

vLLM

How to use RayyanAhmed9477/Health-Chatbot with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "RayyanAhmed9477/Health-Chatbot"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RayyanAhmed9477/Health-Chatbot",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
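
The same OpenAI-compatible endpoint can also be called from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical and not part of vLLM or SGLang, and the response layout (`choices[0].message.content`) is assumed from the OpenAI chat completions format that these servers emulate.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_message: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # Assumed response layout: the reply sits under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

With a server running, `send_chat_request(build_chat_request("http://localhost:8000", "RayyanAhmed9477/Health-Chatbot", "What is the capital of France?"))` would mirror the curl call; for the SGLang server the base URL would be `http://localhost:30000` instead.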

Use Docker

docker model run hf.co/RayyanAhmed9477/Health-Chatbot

SGLang

How to use RayyanAhmed9477/Health-Chatbot with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "RayyanAhmed9477/Health-Chatbot" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RayyanAhmed9477/Health-Chatbot",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "RayyanAhmed9477/Health-Chatbot" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RayyanAhmed9477/Health-Chatbot",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use RayyanAhmed9477/Health-Chatbot with Docker Model Runner:
```
docker model run hf.co/RayyanAhmed9477/Health-Chatbot
```

Health Chatbot

Welcome to the official Hugging Face repository for Health Chatbot, a conversational AI model fine-tuned to assist with health-related queries. This model is based on LLaMA 3.2, fine-tuned using QLoRA for lightweight and efficient training.

Overview

Health Chatbot is designed to provide accurate and conversational responses for general health advice and wellness information. The model is intended for educational purposes and is not a substitute for professional medical consultation.

Key Features:

Fine-tuned using QLoRA for parameter-efficient training.
Trained on a diverse dataset of health-related queries and answers.
Optimized for conversational and empathetic interactions.

Model Details

Base Model: LLaMA 3.2 (meta-llama/Llama-3.2-3B-Instruct)
Training Method: QLoRA (Quantized Low-Rank Adaptation)
Dataset: Custom curated dataset comprising publicly available health resources, FAQs, and synthetic dialogues.
Intended Use: Conversational health assistance and wellness education.

How to Use the Model

You can load and use the model in your Python environment with the transformers library:

Installation

Make sure you have the necessary dependencies installed:

pip install transformers accelerate bitsandbytes

Loading the Model

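from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

tokenizer = AutoTokenizer.from_pretrained("RayyanAhmed9477/Health-Chatbot")
model = AutoModelForCausalLM.from_pretrained(
    "RayyanAhmed9477/Health-Chatbot",
    device_map="auto",
    # 8-bit loading is requested through a quantization config in current transformers releases
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

# Generate a response
def chat(prompt):
    # Move inputs to whichever device device_map="auto" placed the model on
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # max_new_tokens caps the reply only; max_length would also count the prompt tokens
    outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
    # Decode only the newly generated tokens so the prompt is not echoed back
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example usage
prompt = "What are some common symptoms of the flu?"
print(chat(prompt))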
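
For multi-turn conversations, the prompt is usually assembled as a list of role-tagged messages, the same shape used by the pipeline example in the Libraries section. The sketch below is one way to do that; `build_messages`, its `max_turns` cut-off, the system prompt wording, and the sample assistant reply are all hypothetical placeholders, not part of this repository.

```python
def build_messages(history, user_message, system_prompt=None, max_turns=4):
    """Return a chat `messages` list: optional system prompt, recent turns, new user message.

    `history` is a list of (user_text, assistant_text) pairs from earlier turns.
    """
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    # Keep only the most recent turns so the prompt stays short
    recent = history[-max_turns:] if max_turns > 0 else []
    for user_text, assistant_text in recent:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_message})
    return messages


# Placeholder history: the assistant text is made up for illustration
history = [("What are some common symptoms of the flu?", "Fever, cough, and body aches are common.")]
messages = build_messages(
    history,
    "How long do they usually last?",
    system_prompt="You are a health education assistant.",  # hypothetical wording
)
for message in messages:
    print(message["role"], ":", message["content"])
```

The resulting list can be passed to a text-generation pipeline, or to `tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")` if the tokenizer ships a chat template.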

Fine-Tuning the Model

If you want to fine-tune the model further on a custom dataset, follow the steps below.

Requirements

pip install datasets peft

Dataset Preparation

Prepare your dataset in a JSON or CSV format with input and output fields:

Example Dataset (JSON):

[
    {"input": "What are some symptoms of dehydration?", "output": "Symptoms include dry mouth, fatigue, and dizziness."},
    {"input": "How can I boost my immune system?", "output": "Eat a balanced diet, exercise regularly, and get enough sleep."}
]
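
If your data starts out as CSV, it can be converted to the JSON layout above before training. This is a minimal sketch using only the standard library; the helper name `csv_to_records` and the rule of skipping rows with an empty field are assumptions for illustration, not requirements of the model.

```python
import csv
import io
import json


def csv_to_records(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text with `input` and `output` columns into a list of records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {"input", "output"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required column(s): {sorted(missing)}")
    records = []
    for row in reader:
        question = (row.get("input") or "").strip()
        answer = (row.get("output") or "").strip()
        if question and answer:  # skip rows where either field is empty
            records.append({"input": question, "output": answer})
    return records


sample_csv = (
    "input,output\n"
    '"What are some symptoms of dehydration?","Symptoms include dry mouth, fatigue, and dizziness."\n'
    '"How can I boost my immune system?",""\n'
)
records = csv_to_records(sample_csv)
print(json.dumps(records, indent=4))
```

Writing `records` to disk with `json.dump` gives a file that can be loaded with `load_dataset("json", data_files=...)`.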

Training Script

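from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from datasets import load_dataset

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("RayyanAhmed9477/Health-Chatbot")
# LLaMA tokenizers usually ship without a padding token; reuse EOS so batches can be padded
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    "RayyanAhmed9477/Health-Chatbot",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

# Prepare the quantized model for training
# (prepare_model_for_int8_training was replaced by this function in newer peft releases)
model = prepare_model_for_kbit_training(model)

# Define LoRA configuration
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Load your custom dataset
data = load_dataset("json", data_files="your_dataset.json")

# Tokenize each example; joining input and output with a newline is an assumed
# prompt format, so adapt it to the format used for your data
def tokenize(example):
    text = example["input"] + "\n" + example["output"] + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=512)

train_dataset = data["train"].map(tokenize, remove_columns=data["train"].column_names)

# Fine-tune the model (no evaluation strategy is set because no eval dataset is passed)
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    logging_dir="./logs",
    save_strategy="epoch",
    learning_rate=1e-4,
    fp16=True,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    # Pads each batch and copies input_ids into labels for causal language modeling
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

trainer.train()

# Save the fine-tuned model
model.save_pretrained("./fine_tuned_health_chatbot")
tokenizer.save_pretrained("./fine_tuned_health_chatbot")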
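
To get a feel for how small the LoRA update is, the number of trainable parameters implied by the configuration above (r=8 on q_proj and v_proj) can be estimated by hand. The sketch below does the arithmetic; the layer shapes (hidden size 3072, a 1024-wide v_proj output, 28 layers) are assumptions about the base model, so check `model.config` before relying on the numbers.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters LoRA adds to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Assumed shapes for illustration only; read the real values from model.config
hidden_size = 3072   # input width of q_proj and v_proj
kv_size = 1024       # output width of v_proj under grouped-query attention
num_layers = 28

r, lora_alpha = 8, 32

per_layer = lora_param_count(hidden_size, hidden_size, r) + lora_param_count(hidden_size, kv_size, r)
total = per_layer * num_layers
scaling = lora_alpha / r  # factor applied to the low-rank update in the default LoRA setup

print(f"per layer: {per_layer:,}  total: {total:,}  scaling: {scaling}")
```

On a loaded PEFT model, `model.print_trainable_parameters()` reports the actual count and is the number to trust.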

Model Evaluation

Evaluate the model's performance using metrics like perplexity and BLEU:

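import math
import torch
from datasets import load_dataset

# Load evaluation dataset (a single JSON file is exposed as the "train" split)
eval_data = load_dataset("json", data_files="evaluation_dataset.json")["train"]

# Evaluate with perplexity: exp of the mean per-token negative log-likelihood
def compute_perplexity(model, tokenizer, dataset):
    model.eval()
    total_nll, total_tokens = 0.0, 0
    for example in dataset:
        text = example["input"] + "\n" + example["output"]
        enc = tokenizer(text, return_tensors="pt").to(model.device)
        with torch.no_grad():
            loss = model(**enc, labels=enc["input_ids"]).loss  # mean loss over predicted tokens
        n = enc["input_ids"].size(1) - 1  # the first token has no prediction target
        total_nll += loss.item() * n
        total_tokens += n
    return math.exp(total_nll / total_tokens)

print(compute_perplexity(model, tokenizer, eval_data))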
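
BLEU compares a generated answer against a reference answer by n-gram overlap. The following is a simplified, self-contained sketch for intuition only: it assumes whitespace tokenization, a single reference, no smoothing, and n-grams up to bigrams by default, so its scores will not match an established implementation such as sacreBLEU. The function names are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference: str, candidate: str, max_n: int = 2) -> float:
    """Simplified sentence-level BLEU: clipped n-gram precision times a brevity penalty."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngram_counts(cand, n), ngram_counts(ref, n)
        # Clip each candidate n-gram count by its count in the reference
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives a zero score
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates that are shorter than the reference
    brevity_penalty = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "Symptoms include dry mouth, fatigue, and dizziness."
print(sentence_bleu(reference, "Symptoms include dry mouth, fatigue, and dizziness."))
print(sentence_bleu(reference, "Symptoms include dry mouth,"))
```

An exact match scores 1.0, while the truncated second candidate keeps perfect precision but is scaled down by the brevity penalty.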

Limitations and Warnings

The model is not a substitute for professional medical advice.
Responses are generated based on patterns in the training data and may not always be accurate or up-to-date.

Contributing

Contributions are welcome! If you have suggestions, improvements, or issues to report, please create a pull request or an issue in this repository.

License

This model is released under the Apache 2.0 License.

Contact

For any queries or collaborations, reach out to me via GitHub, LinkedIn, or email at rayyanahmed265@yahoo.com.

Acknowledgements

Special thanks to the Hugging Face and Meta AI teams for their open-source contributions to the NLP and machine learning community.

Downloads last month: 6

Model size: 14.5M params (Safetensors, tensor type F32)

Model tree for RayyanAhmed9477/Health-Chatbot

Base model: meta-llama/Llama-3.2-3B-Instruct
Finetuned: this model