edgeai-docs-embedding-qwen1.5-0.5b-instruct
It's still pretty weak on performance, but I hope to improve it with further training, and I'm looking into an adapter as an option too. Preview here:
Model Summary
edgeai-docs-embedding-qwen1.5-0.5b-instruct is a lightweight, instruction-tuned LoRA adapter built on top of Qwen1.5-0.5B.
It is optimized for:
- Documentation summarization
- Conversational responses
- Lightweight embedding-style semantic generation
- Edge and local deployment environments
The model is designed to run efficiently in constrained environments and integrates cleanly with local inference stacks such as Ollama and llama.cpp.
Model Details
Model Description
This model applies Parameter-Efficient Fine-Tuning (LoRA) to the base Qwen/Qwen1.5-0.5B model using custom instruction-formatted documentation datasets.
It is tuned for structured reasoning over API documentation, developer content, and technical text. The objective is to provide high-quality summarization and conversational assistance while maintaining a small deployment footprint.
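The adapter can be loaded with Transformers and PEFT. The sketch below is minimal, and the prompt template in it is an assumption; adjust it to match the instruction format actually used during tuning.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen Qwen1.5-0.5B base and attach this LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B")
model = PeftModel.from_pretrained(base, "eoinedge/edgeai-docs-embedding-qwen1.5-0.5b-instruct")

# Assumed prompt shape; not a documented training template.
prompt = "Summarize the following API documentation:\n\nGET /health returns service status.\n\nSummary:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```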
Developed by
Eoin Jordan
Hugging Face: https://huggingface.co/eoinedge
Model Type
Instruction-tuned causal language model (LoRA adapter)
Base Model
Qwen/Qwen1.5-0.5B
Fine-Tuning Method
LoRA (Low-Rank Adaptation) via PEFT
Framework
- Hugging Face Transformers
- PEFT
- Safetensors
Language
English
License
Apache 2.0
Intended Uses
Direct Use
- Developer documentation summarization
- API explanation generation
- Conversational assistant for technical workflows
- Lightweight semantic generation for search/retrieval pipelines
Downstream Use
- Integrated into RAG (Retrieval-Augmented Generation) systems (see the sketch after this list)
- Embedded inside edge devices
- Wrapped within Ollama or llama.cpp inference environments
- Developer tooling assistants
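As a rough illustration of the RAG use case, the sketch below wires the model into a toy retrieval loop. The retrieval function is a deliberately naive lexical-overlap placeholder (a real pipeline would use a vector index), and the documents and prompt shape are invented for illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
model = PeftModel.from_pretrained(
    AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B"),
    "eoinedge/edgeai-docs-embedding-qwen1.5-0.5b-instruct",
)

docs = [
    "GET /users returns a paginated list of user records.",
    "POST /users creates a new user; requires an API key header.",
]

def retrieve(query, chunks, k=2):
    # Naive lexical-overlap scoring, used here only as a placeholder.
    q = set(query.lower().split())
    return sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)[:k]

question = "How do I create a user?"
context = "\n".join(retrieve(question, docs))
prompt = f"Answer using only this documentation:\n{context}\n\nQuestion: {question}\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```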
Out-of-Scope Use
- High-stakes decision-making systems
- Legal or medical advisory systems
- Large-scale multi-lingual production deployments
- Safety-critical automation
Bias, Risks, and Limitations
Because this model is derived from Qwen1.5 and fine-tuned on technical documentation:
- It may inherit biases present in the base model.
- It is optimized for technical content and may degrade on general creative tasks.
- It may hallucinate undocumented APIs or incorrect implementation details.
- It is not designed for factual verification tasks.
Users should validate outputs before use in production systems.
Training Details
Training Data
- Custom documentation chunks
- API references
- Instruction-formatted JSON datasets (an illustrative record is shown below)
- Manually validated prompt-response pairs
No personally identifiable information (PII) was intentionally included.
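The dataset schema itself is not published. A hypothetical record in a common instruction format might look like this (field names are illustrative, not the actual schema):

```python
record = {
    "instruction": "Summarize the endpoint documentation below.",
    "input": "GET /v1/orders returns all orders for the authenticated account, paginated by cursor.",
    "output": "Lists the authenticated account's orders with cursor-based pagination.",
}
```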
Training Procedure
- LoRA fine-tuning via PEFT (see the configuration sketch below)
- Instruction tuning objective
- Mixed precision training
Training Regime
bf16 mixed precision
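The exact hyperparameters are not published. The sketch below shows what this recipe typically looks like with PEFT; the rank, alpha, dropout, and target-module list are illustrative assumptions, not the values used for this adapter.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B", torch_dtype=torch.bfloat16)

config = LoraConfig(
    r=16,                     # assumed rank, not the published value
    lora_alpha=32,            # assumed scaling factor
    lora_dropout=0.05,        # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention modules, per this card
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
# The training loop itself (e.g. transformers.Trainer with bf16=True) is omitted.
```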
Evaluation
Evaluation was performed using:
- Manual prompt benchmarking
- Structured documentation summarization tests
- API explanation accuracy review
Quantitative benchmarks are limited, given the domain-specific fine-tuning focus.
Technical Specifications
Architecture
- Transformer decoder-only model
- 0.5B parameters (base model)
- LoRA adapter layers applied to attention modules
Adapter Type
Low-Rank Adaptation (LoRA)
Deployment Targets
- Ollama
- llama.cpp (via conversion; see the sketch below)
- Transformers runtime
- Edge inference environments
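One common path to llama.cpp or Ollama is to merge the LoRA weights into the base model, export a plain checkpoint, and then run llama.cpp's GGUF conversion tooling on it. The sketch below shows the merge step; the conversion step happens outside Python, and its script names and flags vary by llama.cpp version.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B")
model = PeftModel.from_pretrained(base, "eoinedge/edgeai-docs-embedding-qwen1.5-0.5b-instruct")

# Fold the LoRA deltas into the base weights so the result is a plain
# Transformers checkpoint that external converters can consume.
merged = model.merge_and_unload()
merged.save_pretrained("merged-edgeai-docs")
AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B").save_pretrained("merged-edgeai-docs")

# Outside Python: run llama.cpp's GGUF conversion tooling on merged-edgeai-docs/,
# then point an Ollama Modelfile at the resulting GGUF file.
```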
Environmental Impact
Base model reference:
Qwen Technical Report (2023)
Carbon emissions were not separately tracked for this adapter training run.
For estimation guidance, see: https://mlco2.github.io/impact
Citation
If you use this model, please cite:
@misc{edgeai-docs-embedding-qwen1.5-0.5b-instruct,
author = {Jordan, Eoin},
title = {edgeai-docs-embedding-qwen1.5-0.5b-instruct},
year = {2026},
howpublished = {\url{https://huggingface.co/eoinedge/edgeai-docs-embedding-qwen1.5-0.5b-instruct}}
}