Instructions for using stabilityai/stablecode-completion-alpha-3b-4k with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use stabilityai/stablecode-completion-alpha-3b-4k with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="stabilityai/stablecode-completion-alpha-3b-4k")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablecode-completion-alpha-3b-4k")
model = AutoModelForCausalLM.from_pretrained("stabilityai/stablecode-completion-alpha-3b-4k")
```

- Notebooks
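Once the tokenizer and model are loaded, a completion can be generated with `model.generate`. The sketch below is illustrative, not from the model card: the prompt, `max_new_tokens` value, and greedy decoding are assumptions you should tune for your use case.

```python
MODEL_ID = "stabilityai/stablecode-completion-alpha-3b-4k"

def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Return the model's continuation of `prompt` (greedy decoding)."""
    # Imports are deferred so the helper can be defined without loading anything.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    # do_sample=False keeps the example deterministic; enable sampling as needed.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(complete("def fibonacci(n):"))
```

Note that loading the full 3B model requires a machine with enough RAM/VRAM; the pipeline snippet above is equivalent for quick experiments.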
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use stabilityai/stablecode-completion-alpha-3b-4k with vLLM:
Install from pip and serve the model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "stabilityai/stablecode-completion-alpha-3b-4k"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stabilityai/stablecode-completion-alpha-3b-4k",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker
```sh
docker model run hf.co/stabilityai/stablecode-completion-alpha-3b-4k
```
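Because the vLLM server exposes an OpenAI-compatible completions endpoint, it can also be called from Python. This is a minimal standard-library sketch mirroring the curl example above; the base URL and sampling parameters are the same assumptions (a local server on port 8000), not anything specific to this model.

```python
import json
import urllib.request

def build_completion_request(prompt: str,
                             model: str = "stabilityai/stablecode-completion-alpha-3b-4k",
                             max_tokens: int = 512,
                             temperature: float = 0.5) -> dict:
    """Build the same JSON payload as the curl example."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def request_completion(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST to the OpenAI-compatible /v1/completions endpoint and return the text."""
    payload = build_completion_request(prompt)
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # For the completions endpoint, the generated text is in choices[0].text.
    return body["choices"][0]["text"]

if __name__ == "__main__":
    # Requires a running vLLM server (see the commands above).
    print(request_completion("def fibonacci(n):"))
```

The same client works against the SGLang server below by changing `base_url` to `http://localhost:30000`.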
- SGLang
How to use stabilityai/stablecode-completion-alpha-3b-4k with SGLang:
Install from pip and serve the model
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "stabilityai/stablecode-completion-alpha-3b-4k" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stabilityai/stablecode-completion-alpha-3b-4k",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images
```sh
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "stabilityai/stablecode-completion-alpha-3b-4k" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stabilityai/stablecode-completion-alpha-3b-4k",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use stabilityai/stablecode-completion-alpha-3b-4k with Docker Model Runner:
```sh
docker model run hf.co/stabilityai/stablecode-completion-alpha-3b-4k
```
Activating the model on huggingface-vscode
Hi. Newbie here. Could you please help me figure out how to use the model in VS Code?
I'm trying to activate StableCode-Completion-Alpha-3B-4K on https://github.com/huggingface/huggingface-vscode
I have generated an HF token with write permission and added it in the VS Code extension settings. The bigcode/starcoder model works fine.
On the model page I pressed the 'Deploy' button, selected the 'Inference API' item, copied the https://api-inference.huggingface.co/models/stabilityai/stablecode-completion-alpha-3b-4k API endpoint URL, and pasted it into the 'Hugging Face Code: Model ID Or Endpoint' settings field.
Unlike with the default bigcode/starcoder model, the extension doesn't seem to work. I tried restarting the extension host and also tried stabilityai/stablecode-completion-alpha-3b-4k as the value, with no luck.
When I type in my editor, I see the status bar icon next to Hugging Face Code spinning for a split second, and then nothing happens.
The OUTPUT panel shows something like:
INPUT to API: (with parameters {"max_new_tokens":60,"temperature":null,"do_sample":false,"top_p":0.95,"stop":["<|endoftext|>"]})
def main
What am I doing wrong?
Me too, trying to figure out the issue. It works perfectly fine with the default model.
I am trying this as well
Did anyone figure this out?