Florence-2-base-PromptGen
Florence-2-base-PromptGen is a model trained for MiaoshouAI Tagger for ComfyUI. It is an advanced image captioning tool based on the Microsoft Florence-2 Model and fine-tuned to perfection.
Why another tagging model?
Most vision models today are trained mainly for general vision-recognition purposes, but when prompting and tagging images for model training, the format and level of detail of the captions are quite different.
Florence-2-base-PromptGen is trained for exactly this purpose, aiming to improve the accuracy and usability of prompting and tagging. The model is trained on images and cleaned tags from Civitai, so the captions it produces for an image match the kind of prompts used to generate such images.
Instruction prompt:
A new instruction prompt, <GENERATE_PROMPT>, is added for this purpose alongside <DETAILED_CAPTION> and <MORE_DETAILED_CAPTION>. It responds in Danbooru tagging style with much better accuracy and an appropriate level of detail.
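Since <GENERATE_PROMPT> returns Danbooru-style output, a small post-processing helper can normalize the raw caption before it is stored as a training tag list. The function below is a hypothetical sketch (not part of the model's API), assuming the output is a comma-separated tag string:

```python
def clean_tags(raw: str) -> list[str]:
    """Split a comma-separated Danbooru-style tag string,
    strip whitespace, lowercase, and drop duplicates while
    preserving the original order."""
    seen = set()
    tags = []
    for tag in raw.split(","):
        tag = tag.strip().lower()
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags
```

For example, `clean_tags("1girl, Solo, long hair, solo")` returns `["1girl", "solo", "long hair"]`.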
Version History:
- v0.8: New instruction trained for <GENERATE_PROMPT>
- v0.9: Improved vision ability on uncensored data for <DETAILED_CAPTION> and <MORE_DETAILED_CAPTION>
How to use:
To use this model, you can load it directly from the Hugging Face Model Hub:
import requests
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained("MiaoshouAI/Florence-2-base-PromptGen", trust_remote_code=True).to(device)
processor = AutoProcessor.from_pretrained("MiaoshouAI/Florence-2-base-PromptGen", trust_remote_code=True)

prompt = "<GENERATE_PROMPT>"

# Fetch an example image
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=prompt, images=image, return_tensors="pt").to(device)

generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    do_sample=False,
    num_beams=3,
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
parsed_answer = processor.post_process_generation(generated_text, task=prompt, image_size=(image.width, image.height))
print(parsed_answer)
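When building a training dataset, captions are commonly saved as a `.txt` sidecar file next to each image. The loop below is a sketch of how the generate-and-parse steps above could be wrapped for a folder of images; `caption_image` is a hypothetical callable standing in for the model pipeline shown above:

```python
from pathlib import Path

def save_captions(image_dir: str, caption_image, exts=(".jpg", ".jpeg", ".png", ".webp")) -> int:
    """Run caption_image(path) on every image in image_dir and write
    the result to a sidecar .txt file with the same stem.
    Returns the number of captions written."""
    written = 0
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() not in exts:
            continue
        caption = caption_image(path)  # e.g. the <GENERATE_PROMPT> call above
        path.with_suffix(".txt").write_text(caption, encoding="utf-8")
        written += 1
    return written
```

This sidecar layout (`image.jpg` / `image.txt`) is the convention most LoRA training scripts expect.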
Use under MiaoshouAI Tagger ComfyUI
If you just want to use this model, you can do so through ComfyUI-Miaoshouai-Tagger:
https://github.com/miaoshouai/ComfyUI-Miaoshouai-Tagger
Detailed usage and installation instructions are available there.