	---
	license: llama2
	base_model: codellama/CodeLlama-7b-Instruct-hf
	tags:
	- fine-tuned
	- educational
	- qa
	- code
	- llama
	- peft
	- lora
	language:
	- en
	pipeline_tag: text-generation
	library_name: peft
	---

	# CodeLLaMa7B-FineTuned-byMoomen

	This model is a fine-tuned version of [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf) using LoRA (Low-Rank Adaptation) for educational Q&A tasks.

	## Model Details

	- Base Model: codellama/CodeLlama-7b-Instruct-hf
	- Fine-tuning Method: LoRA (Low-Rank Adaptation)
	- LoRA Rank: 32
	- LoRA Alpha: 64
	- Target Modules: ['gate_proj', 'lm_head', 'k_proj', 'q_proj', 'up_proj', 'down_proj', 'v_proj', 'o_proj']
	- Training Focus: Educational programming Q&A
	- Model Type: Causal Language Model
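As a rough illustration of what the settings above imply, the sketch below computes the LoRA scaling factor (alpha divided by rank) and estimates how many trainable parameters the adapter adds. The layer dimensions (hidden size 4096, MLP size 11008, 32 layers, vocabulary size 32016) are assumptions based on the public CodeLlama-7b configuration and are not stated in this card; check them against the base model's `config.json` before relying on the result.

```python
# Hyperparameters taken from the Model Details list
LORA_RANK = 32
LORA_ALPHA = 64

# ASSUMED base-model dimensions (not stated in this card)
HIDDEN = 4096
INTERMEDIATE = 11008
NUM_LAYERS = 32
VOCAB = 32016

# (in_features, out_features) of each targeted linear layer inside one decoder block
PER_LAYER_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}
# lm_head is adapted once, outside the decoder blocks
LM_HEAD_SHAPE = (HIDDEN, VOCAB)


def lora_params(in_features, out_features, rank=LORA_RANK):
    """LoRA adds two matrices per layer: A (rank x in) and B (out x rank)."""
    return rank * (in_features + out_features)


scaling = LORA_ALPHA / LORA_RANK
per_layer = sum(lora_params(i, o) for i, o in PER_LAYER_SHAPES.values())
total = per_layer * NUM_LAYERS + lora_params(*LM_HEAD_SHAPE)

print(f"LoRA scaling factor (alpha / rank): {scaling}")
print(f"Estimated trainable adapter parameters: {total:,}")
```

Under these assumptions the adapter is on the order of 1% of the base model's size, which is why only the adapter files need to be distributed.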

	## Usage

	### Quick Start

	```python
	from peft import AutoPeftModelForCausalLM
	from transformers import AutoTokenizer

	# Load model and tokenizer
	model = AutoPeftModelForCausalLM.from_pretrained("Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen")
	tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")

	# Generate response
	prompt = "Explain recursion in programming"
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	# temperature only takes effect when sampling is enabled
	outputs = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
	response = tokenizer.decode(outputs[0], skip_special_tokens=True)
	print(response)
	```

	### Chat Format Usage

	```python
	# For educational Q&A conversations (reuses `model` and `tokenizer` from Quick Start)
	messages = [
	    {"role": "system", "content": "You are a helpful educational assistant."},
	    {"role": "user", "content": "What is the difference between lists and tuples in Python?"},
	]

	# The chat template already adds the BOS token, so skip special tokens when tokenizing
	formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	inputs = tokenizer(formatted_prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
	outputs = model.generate(**inputs, max_new_tokens=300)
	response = tokenizer.decode(outputs[0], skip_special_tokens=True)
	print(response)
	```
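To make the chat format more concrete, here is a minimal sketch of the Llama 2 style `[INST]` prompt that the base model's chat template is expected to produce for a single system and user turn. The helper `build_llama2_prompt` is a hypothetical name introduced only for illustration; the tokenizer's own `apply_chat_template` remains the authoritative source for the exact string.

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Format one system + user turn in the Llama 2 [INST] style (illustrative only)."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


prompt = build_llama2_prompt(
    "You are a helpful educational assistant.",
    "What is the difference between lists and tuples in Python?",
)
print(prompt)
```

Because the formatted string already starts with the `<s>` token, tokenizing it again with default settings would prepend a second one.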

	### Memory-Efficient Loading

	```python
	# For systems with limited VRAM (requires the bitsandbytes package)
	import torch
	from peft import AutoPeftModelForCausalLM
	from transformers import BitsAndBytesConfig

	quantization_config = BitsAndBytesConfig(
	    load_in_4bit=True,
	    bnb_4bit_compute_dtype=torch.float16,
	)

	model = AutoPeftModelForCausalLM.from_pretrained(
	    "Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen",
	    quantization_config=quantization_config,
	    device_map="auto",
	)
	```
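For a sense of why 4-bit loading helps, the following back-of-the-envelope sketch compares the memory needed for the base model's weights at different precisions. The parameter count (about 6.74 billion) is an assumption for a 7B Llama-family model, and the estimate covers weights only; activations, the KV cache, and quantization overhead are not included.

```python
# ASSUMED parameter count for a 7B Llama-family model (not stated in this card)
ASSUMED_PARAMS = 6.74e9

# Storage cost per parameter at each precision
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "4-bit": 0.5}


def weight_memory_gib(num_params, bytes_per_param):
    """Memory for the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


estimates = {
    name: weight_memory_gib(ASSUMED_PARAMS, nbytes)
    for name, nbytes in BYTES_PER_PARAM.items()
}
for name, gib in estimates.items():
    print(f"{name:>8}: ~{gib:.1f} GiB")
```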

	## Training Details

	This model was fine-tuned using:
	- Parameter-Efficient Fine-Tuning (PEFT) with LoRA
	- Educational conversation dataset focused on programming concepts
	- Optimized for Q&A format with system/user/assistant roles

	## Intended Use

	This model is designed for:
	- 📚 Educational programming Q&A
	- 💡 Concept explanations in computer science
	- 🔧 Code debugging assistance
	- 🎓 Technical tutoring and learning support

	## Limitations

	- Inherits the limitations of the base model, codellama/CodeLlama-7b-Instruct-hf
	- Optimized for educational content; it may perform poorly on other tasks
	- This repository contains only the LoRA adapter weights, so the base model is required for inference
	- Performance depends on the quality of training data

	## Model Architecture

	This is a LoRA adapter that needs to be loaded with the base model. The adapter files are:
	- `adapter_config.json`: LoRA configuration
	- `adapter_model.safetensors`: Trained LoRA weights
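The sketch below shows one way to inspect the LoRA settings stored in `adapter_config.json` using only the standard library. The JSON excerpt is hypothetical: it mirrors the values in the Model Details section and uses the field names PEFT writes for LoRA adapters, but the real file contains additional fields. The helper `summarize_adapter_config` is likewise an illustrative name.

```python
import json

# Hypothetical excerpt mirroring the Model Details list; the real file has more fields
EXAMPLE_ADAPTER_CONFIG = """
{
  "peft_type": "LORA",
  "base_model_name_or_path": "codellama/CodeLlama-7b-Instruct-hf",
  "r": 32,
  "lora_alpha": 64,
  "target_modules": ["gate_proj", "lm_head", "k_proj", "q_proj",
                     "up_proj", "down_proj", "v_proj", "o_proj"]
}
"""


def summarize_adapter_config(raw_json: str) -> dict:
    """Extract the fields needed to pair the adapter with its base model."""
    cfg = json.loads(raw_json)
    return {
        "base_model": cfg["base_model_name_or_path"],
        "rank": cfg["r"],
        "alpha": cfg["lora_alpha"],
        "targets": sorted(cfg["target_modules"]),
    }


summary = summarize_adapter_config(EXAMPLE_ADAPTER_CONFIG)
print(summary)
```

To run it against the actual adapter, replace the example string with the contents of the downloaded `adapter_config.json`.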

	## License

	This model is released under the same license as the base model: the Llama 2 Community License.

	## Citation

	If you use this model, please cite:

	```bibtex
	@misc{CodeLLaMa7B_FineTuned_byMoomen,
	  title={CodeLLaMa7B-FineTuned-byMoomen},
	  author={Moomen123Msaadi},
	  year={2024},
	  publisher={Hugging Face},
	  url={https://huggingface.co/Moomen123Msaadi/CodeLLaMa7B-FineTuned-byMoomen}
	}
	```