Instructions to use MERaLiON/MERaLiON-2-10B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use MERaLiON/MERaLiON-2-10B with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("automatic-speech-recognition", model="MERaLiON/MERaLiON-2-10B", trust_remote_code=True)

    # Load model directly
    from transformers import AutoModelForSpeechSeq2Seq

    model = AutoModelForSpeechSeq2Seq.from_pretrained("MERaLiON/MERaLiON-2-10B", trust_remote_code=True, dtype="auto")

- Notebooks
- Google Colab
- Kaggle
Remove flash-attn from requirements and GPU inference example
#3
by YingxuHe - opened
Remove flash-attn as a required dependency and remove attn_implementation="flash_attention_2" from the GPU inference example.
The model works with PyTorch's built-in scaled dot-product attention (SDPA), which transformers selects automatically when flash-attn is not installed.
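For reference, a minimal sketch of what loading looks like without flash-attn. This is not the updated README example itself; the helper names (`flash_attn_installed`, `load_kwargs`, `load_model`) are hypothetical and only illustrate the idea. It assumes a transformers version that accepts `dtype="auto"` (as in the snippet on the model page) and that the model's remote code accepts `attn_implementation="sdpa"` when it is requested explicitly.

```python
import importlib.util

MODEL_ID = "MERaLiON/MERaLiON-2-10B"


def flash_attn_installed() -> bool:
    """Return True if the optional flash-attn package can be imported."""
    return importlib.util.find_spec("flash_attn") is not None


def load_kwargs(force_sdpa: bool = False) -> dict:
    """Build from_pretrained keyword arguments (hypothetical helper).

    By default no attn_implementation is passed, so transformers picks the
    attention backend itself. Set force_sdpa=True to request SDPA explicitly
    (assumption: the model's remote code supports this value).
    """
    kwargs = {"trust_remote_code": True, "dtype": "auto"}
    if force_sdpa:
        kwargs["attn_implementation"] = "sdpa"
    return kwargs


def load_model(force_sdpa: bool = False):
    """Load the model; transformers is imported lazily because it is heavy."""
    from transformers import AutoModelForSpeechSeq2Seq

    return AutoModelForSpeechSeq2Seq.from_pretrained(
        MODEL_ID, **load_kwargs(force_sdpa)
    )


if __name__ == "__main__":
    print("flash-attn installed:", flash_attn_installed())
    print("from_pretrained kwargs:", load_kwargs())
```

Calling `load_model()` downloads the full checkpoint, so the `__main__` block only prints the environment check and the arguments that would be used.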
YingxuHe changed pull request status to merged