How to use Efficient-Large-Model/SANA-WM_bidirectional with Diffusers:
```shell
pip install -U diffusers transformers accelerate
```

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("Efficient-Large-Model/SANA-WM_bidirectional", dtype=torch.bfloat16, device_map="cuda")
pipe.to("cuda")

prompt = "A man with short gray hair plays a red electric guitar."
image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
)

output = pipe(image=image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")
```
```json
{
  "_class_name": "LTX2TextConnectors",
  "_diffusers_version": "0.37.0.dev0",
  "audio_connector_attention_head_dim": 128,
  "audio_connector_num_attention_heads": 30,
  "audio_connector_num_layers": 2,
  "audio_connector_num_learnable_registers": 128,
  "caption_channels": 3840,
  "causal_temporal_positioning": false,
  "connector_rope_base_seq_len": 4096,
  "rope_double_precision": true,
  "rope_theta": 10000.0,
  "rope_type": "split",
  "text_proj_in_factor": 49,
  "video_connector_attention_head_dim": 128,
  "video_connector_num_attention_heads": 30,
  "video_connector_num_layers": 2,
  "video_connector_num_learnable_registers": 128
}
```
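
As a quick sanity check on the values in this config, the sketch below parses a subset of the fields and multiplies the number of attention heads by the per-head dimension for each connector. It assumes the usual multi-head attention convention (width = heads × head dimension); whether `LTX2TextConnectors` actually uses that product as its hidden size is an assumption, not something this file states. The helper `connector_width` is a hypothetical name introduced for illustration.

```python
import json

# Subset of the connector config shown above, copied verbatim.
CONFIG_TEXT = """
{
  "audio_connector_attention_head_dim": 128,
  "audio_connector_num_attention_heads": 30,
  "caption_channels": 3840,
  "video_connector_attention_head_dim": 128,
  "video_connector_num_attention_heads": 30
}
"""

config = json.loads(CONFIG_TEXT)


def connector_width(cfg: dict, modality: str) -> int:
    """Hypothetical helper: heads * head_dim for the given connector.

    Assumes the standard multi-head attention layout; the real module
    may compute its inner dimension differently.
    """
    heads = cfg[f"{modality}_connector_num_attention_heads"]
    head_dim = cfg[f"{modality}_connector_attention_head_dim"]
    return heads * head_dim


for modality in ("video", "audio"):
    width = connector_width(config, modality)
    matches = width == config["caption_channels"]
    print(f"{modality}: width={width}, equals caption_channels: {matches}")
```

Under that assumption, both connectors come out at 30 × 128 = 3840, the same number as `caption_channels`.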