Instructions to use QuixiAI/based-30b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use QuixiAI/based-30b with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="QuixiAI/based-30b")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("QuixiAI/based-30b")
model = AutoModelForCausalLM.from_pretrained("QuixiAI/based-30b")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use QuixiAI/based-30b with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "QuixiAI/based-30b"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "QuixiAI/based-30b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
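The same completion request can also be issued from Python with only the standard library. This is a minimal sketch, assuming the vLLM server above is running on `localhost:8000`; the helper names `completion_request` and `complete` are illustrative, not part of any API:

```python
import json
import urllib.request

def completion_request(prompt, model="QuixiAI/based-30b",
                       max_tokens=512, temperature=0.5):
    """Build the JSON body for an OpenAI-compatible /v1/completions call."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def complete(prompt, base_url="http://localhost:8000"):
    """POST the request to the server and return the generated text."""
    body = json.dumps(completion_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    return result["choices"][0]["text"]

# Usage (requires a running server):
#   text = complete("Once upon a time,")
```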
```shell
docker model run hf.co/QuixiAI/based-30b
```
- SGLang
How to use QuixiAI/based-30b with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "QuixiAI/based-30b" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "QuixiAI/based-30b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "QuixiAI/based-30b" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "QuixiAI/based-30b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use QuixiAI/based-30b with Docker Model Runner:
```shell
docker model run hf.co/QuixiAI/based-30b
```
Great work, and here are my personal thoughts
This is excellent work; thank you for sharing your results.
However, after reviewing the dataset, I had a thought: could the original model achieve similar results?
In Meta AI's earlier LIMA paper, the authors concluded that most of a model's knowledge is learned during the pre-training phase, and that subsequent instruction tuning and reinforcement learning serve "to better align to end tasks and user preferences". So I think your work is essentially adjusting this alignment, rather than adding new knowledge.
Based on this idea, I believe it might be possible to achieve similar effects simply by adjusting the prompts. I immediately thought of the prompt I saw before for putting ChatGPT into "Developer Mode". Although OpenAI has imposed many restrictions on this, we can still remove some of them through similar prompts. Here's an example I am sharing, with a question adopted from your dataset:
https://chat.openai.com/share/8ec1dde0-2101-47bd-8c1a-f10a538be9c9
My personal feeling is that the results are somewhat similar to the effects the dataset is trying to demonstrate, which seems to suggest that fine-tuning effectively stands in for such prompts.
Based on this observation, I thought that perhaps we could use such prompts to obtain a large number of normal outputs and "human mode" outputs from ChatGPT, build a dataset contrasting the two kinds of responses, and finally use it for instruction tuning. As someone who has only recently entered this field, I'm not sure whether anyone has done this before, and I don't have a dataset to carry out the experiment myself. Do you think this idea makes sense and is feasible?
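The paired-response idea above can be sketched as a tiny dataset-building step. This is only an illustration of the proposed data format: `build_pairs`, the record fields, and the example strings are all hypothetical placeholders, not real ChatGPT outputs or any existing tooling:

```python
import json

def build_pairs(records):
    """Turn (prompt, normal, unrestricted) triples into contrastive
    instruction-tuning examples: the unrestricted answer is 'chosen',
    the default aligned answer is 'rejected'."""
    pairs = []
    for prompt, normal, unrestricted in records:
        pairs.append({
            "prompt": prompt,
            "chosen": unrestricted,   # "human mode" style answer
            "rejected": normal,       # default aligned answer
        })
    return pairs

def save_jsonl(pairs, path):
    """Write one JSON object per line, a common fine-tuning format."""
    with open(path, "w", encoding="utf-8") as f:
        for p in pairs:
            f.write(json.dumps(p, ensure_ascii=False) + "\n")

# Placeholder data standing in for collected model outputs.
records = [
    ("What do you think about X?",
     "As an AI language model, I don't have opinions.",
     "Here's my honest take on X: ..."),
]
pairs = build_pairs(records)
```

Such chosen/rejected pairs are the shape preference-tuning methods typically consume, so the collected data could feed either plain instruction tuning or a preference objective.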
It's different - if you apply an alignment and then later convince the model to ignore it, it loses something in the process.
It's like if you cut off someone's hand and then replace it with a robot hand.
Also - my goal isn't to make it "pretend to have opinions".
My goal is to reveal its "actual" opinions.