Instructions to use q-future/one-align with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use q-future/one-align with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("zero-shot-image-classification", model="q-future/one-align", trust_remote_code=True)
pipe(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png",
    candidate_labels=["animals", "humans", "landscape"],
)
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("q-future/one-align", trust_remote_code=True, dtype="auto")
```

A short sketch of consuming the pipeline output follows the notebook links below.

- Notebooks
- Google Colab
- Kaggle
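As referenced above, here is a hedged sketch of consuming the pipeline's output. It assumes the standard zero-shot-image-classification contract (a list of label/score dicts sorted by descending score); whether this model's remote code follows that contract is an assumption, not something the snippet above confirms.

```python
# Assumption: the pipeline returns the standard zero-shot-image-classification
# output, a list of {"label": str, "score": float} dicts sorted by score.
from transformers import pipeline

pipe = pipeline("zero-shot-image-classification", model="q-future/one-align", trust_remote_code=True)
preds = pipe(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png",
    candidate_labels=["animals", "humans", "landscape"],
)
for pred in preds:
    # e.g. "animals: 0.950"
    print(f"{pred['label']}: {pred['score']:.3f}")
```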
Fix SDPA & Flash-Attention #7
by Agnellino - opened
This PR aims to solve two issues with SDPA and flash attention:
- SDPA uses the `_unmask_unattended` method of `AttentionMaskConverter`, but this function appears nowhere; it is added in this PR.
- Flash attention uses `_get_unpad_data` from `transformers.models.llama.modeling_llama`, but the star import does not include it in more recent versions of transformers (>=4.48.0); see the import sketch just below.
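A minimal sketch of the flash-attention fix, assuming the point is to import the helper explicitly rather than rely on the star import (the exact import path in the PR may differ; newer transformers releases expose the helper in `transformers.modeling_flash_attention_utils`):

```python
# Hedged sketch: import _get_unpad_data explicitly instead of relying on the
# star import, which no longer exposes it in transformers >= 4.48.0.
try:
    # Newer releases keep the helper with the shared flash-attention utilities.
    from transformers.modeling_flash_attention_utils import _get_unpad_data
except ImportError:
    # Older releases defined it next to the Llama modeling code.
    from transformers.models.llama.modeling_llama import _get_unpad_data
```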
The implementation of `_unmask_unattended` is a raw copy-paste of the implementation given there, so nothing fancy to worry about: https://github.com/huggingface/transformers/blob/v4.37.0/src/transformers/modeling_attn_mask_utils.py#L189
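For readers who don't want to follow the link, here is a minimal sketch of the idea behind the helper. Note this is the simplified signature from later transformers releases, not the fuller v4.37.0 implementation the PR actually copies:

```python
import torch

def _unmask_unattended(expanded_mask: torch.Tensor, min_dtype: float) -> torch.Tensor:
    # Rows of the expanded 4-D attention mask that attend to nothing (e.g. pure
    # left padding) are all equal to min_dtype; multiplying by the negated
    # all-masked test zeroes those rows out, i.e. marks them fully unmasked,
    # so SDPA does not produce NaNs on them.
    return expanded_mask.mul(~torch.all(expanded_mask == min_dtype, dim=-1, keepdim=True))
```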
I don't know why, but it looks as if a lot of lines of code are changed... that's not the case: the diff is simply an import of `_get_unpad_data` and the implementation of `_unmask_unattended`.
Agnellino changed pull request status to open