edgeai-docs-embedding-qwen1.5-0.5b-instruct
It's still pretty weak on performance, but I hope to improve it with further training, and I'm looking into an adapter as an option too. Preview here:
Model Summary
edgeai-docs-embedding-qwen1.5-0.5b-instruct is a lightweight, instruction-tuned LoRA adapter built on top of Qwen1.5-0.5B.
It is optimized for:
- Documentation summarization
- Conversational responses
- Lightweight embedding-style semantic generation
- Edge and local deployment environments
The model is designed to run efficiently in constrained environments and integrates cleanly with local inference stacks such as Ollama and llama.cpp.
Model Details
Model Description
This model applies Parameter-Efficient Fine-Tuning (LoRA) to the base Qwen/Qwen1.5-0.5B model using custom instruction-formatted documentation datasets.
It is tuned for structured reasoning over API documentation, developer content, and technical text. The objective is to provide high-quality summarization and conversational assistance while maintaining a small deployment footprint.
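The adapter can be loaded with Transformers and PEFT. The sketch below is minimal, and the prompt template in it is an assumption; adjust it to match the instruction format actually used during tuning.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen Qwen1.5-0.5B base and attach this LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B")
model = PeftModel.from_pretrained(base, "eoinedge/edgeai-docs-embedding-qwen1.5-0.5b-instruct")

# Assumed prompt shape; not a documented training template.
prompt = "Summarize the following API documentation:\n\nGET /health returns service status.\n\nSummary:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```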
Developed by
Eoin Jordan
Hugging Face: https://huggingface.co/eoinedge
Model Type
Instruction-tuned causal language model (LoRA adapter)
Base Model
Qwen/Qwen1.5-0.5B
Fine-Tuning Method
LoRA (Low-Rank Adaptation) via PEFT
Framework
- Hugging Face Transformers
- PEFT
- Safetensors
Language
English
License
Apache 2.0
Intended Uses
Direct Use
- Developer documentation summarization
- API explanation generation
- Conversational assistant for technical workflows
- Lightweight semantic generation for search/retrieval pipelines
Downstream Use
- Integrated into RAG (Retrieval-Augmented Generation) systems (see the sketch after this list)
- Embedded inside edge devices
- Wrapped within Ollama or llama.cpp inference environments
- Developer tooling assistants
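As a rough illustration of the RAG use case, the sketch below wires the model into a toy retrieval loop. The retrieval function is a deliberately naive lexical-overlap placeholder (a real pipeline would use a vector index), and the documents and prompt shape are invented for illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
model = PeftModel.from_pretrained(
    AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B"),
    "eoinedge/edgeai-docs-embedding-qwen1.5-0.5b-instruct",
)

docs = [
    "GET /users returns a paginated list of user records.",
    "POST /users creates a new user; requires an API key header.",
]

def retrieve(query, chunks, k=2):
    # Naive lexical-overlap scoring, used here only as a placeholder.
    q = set(query.lower().split())
    return sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)[:k]

question = "How do I create a user?"
context = "\n".join(retrieve(question, docs))
prompt = f"Answer using only this documentation:\n{context}\n\nQuestion: {question}\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```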
Out-of-Scope Use
- High-stakes decision-making systems
- Legal or medical advisory systems
- Large-scale multi-lingual production deployments
- Safety-critical automation
Bias, Risks, and Limitations
Because this model is derived from Qwen1.5 and fine-tuned on technical documentation:
- It may inherit biases present in the base model.
- It is optimized for technical content and may degrade on general creative tasks.
- It may hallucinate undocumented APIs or incorrect implementation details.
- It is not designed for factual verification tasks.
Users should validate outputs before use in production systems.
Training Details
Training Data
- Custom documentation chunks
- API references
- Instruction-formatted JSON datasets (an illustrative record is shown below)
- Manually validated prompt-response pairs
No personally identifiable information (PII) was intentionally included.
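The dataset schema itself is not published. A hypothetical record in a common instruction format might look like this (field names are illustrative, not the actual schema):

```python
record = {
    "instruction": "Summarize the endpoint documentation below.",
    "input": "GET /v1/orders returns all orders for the authenticated account, paginated by cursor.",
    "output": "Lists the authenticated account's orders with cursor-based pagination.",
}
```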
Training Procedure
- LoRA fine-tuning via PEFT (see the configuration sketch below)
- Instruction tuning objective
- Mixed precision training
Training Regime
bf16 mixed precision
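The exact hyperparameters are not published. The sketch below shows what this recipe typically looks like with PEFT; the rank, alpha, dropout, and target-module list are illustrative assumptions, not the values used for this adapter.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B", torch_dtype=torch.bfloat16)

config = LoraConfig(
    r=16,                     # assumed rank, not the published value
    lora_alpha=32,            # assumed scaling factor
    lora_dropout=0.05,        # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention modules, per this card
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
# The training loop itself (e.g. transformers.Trainer with bf16=True) is omitted.
```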
Evaluation
Evaluation was performed using:
- Manual prompt benchmarking
- Structured documentation summarization tests
- API explanation accuracy review
Quantitative benchmarks are limited, given the domain-specific fine-tuning focus.
Technical Specifications
Architecture
- Transformer decoder-only model
- 0.5B parameters (base model)
- LoRA adapter layers applied to attention modules
Adapter Type
Low-Rank Adaptation (LoRA)
Deployment Targets
- Ollama
- llama.cpp (via conversion; see the sketch below)
- Transformers runtime
- Edge inference environments
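One common path to llama.cpp or Ollama is to merge the LoRA weights into the base model, export a plain checkpoint, and then run llama.cpp's GGUF conversion tooling on it. The sketch below shows the merge step; the conversion step happens outside Python, and its script names and flags vary by llama.cpp version.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B")
model = PeftModel.from_pretrained(base, "eoinedge/edgeai-docs-embedding-qwen1.5-0.5b-instruct")

# Fold the LoRA deltas into the base weights so the result is a plain
# Transformers checkpoint that external converters can consume.
merged = model.merge_and_unload()
merged.save_pretrained("merged-edgeai-docs")
AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B").save_pretrained("merged-edgeai-docs")

# Outside Python: run llama.cpp's GGUF conversion tooling on merged-edgeai-docs/,
# then point an Ollama Modelfile at the resulting GGUF file.
```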
Environmental Impact
Base model reference:
Qwen Technical Report (2023)
Carbon emissions were not separately tracked for this adapter training run.
For estimation guidance, see: https://mlco2.github.io/impact
Citation
If you use this model, please cite:
@misc{edgeai-docs-embedding-qwen1.5-0.5b-instruct,
author = {Jordan, Eoin},
title = {edgeai-docs-embedding-qwen1.5-0.5b-instruct},
year = {2026},
howpublished = {\url{https://huggingface.co/eoinedge/edgeai-docs-embedding-qwen1.5-0.5b-instruct}}
}