Part of the Perception Encoder collection: OpenCLIP (PE Core image + text) and timm PE Core, Spatial, Lang (ViT only) weights. NOTE: These weights do not work with the original modeling code.
How to use timm/vit_pe_spatial_large_patch14_448.fb with timm:
import timm
model = timm.create_model("hf_hub:timm/vit_pe_spatial_large_patch14_448.fb", pretrained=True)

How to use timm/vit_pe_spatial_large_patch14_448.fb with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("image-feature-extraction", model="timm/vit_pe_spatial_large_patch14_448.fb")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("timm/vit_pe_spatial_large_patch14_448.fb", dtype="auto")
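
When working with the extracted features, it helps to know how many patch tokens to expect. The sketch below assumes the model name follows the usual timm convention, where `patch14_448` means 14x14 pixel patches on a 448x448 input; the `patch_grid` helper is hypothetical and not part of timm or Transformers. It counts patch tokens only, so any extra tokens the model may prepend (such as a class token) are not included.

```python
import re

MODEL_NAME = "timm/vit_pe_spatial_large_patch14_448.fb"


def patch_grid(model_name: str) -> tuple[int, int, int]:
    """Parse (patch_size, image_size, patches_per_side) from a timm-style name.

    Assumes the name contains a 'patch<P>_<S>' segment, which is a naming
    convention and not something guaranteed by the library.
    """
    match = re.search(r"patch(\d+)_(\d+)", model_name)
    if match is None:
        raise ValueError(f"no 'patch<P>_<S>' segment in {model_name!r}")
    patch_size, image_size = int(match.group(1)), int(match.group(2))
    if image_size % patch_size != 0:
        raise ValueError("image size is not a multiple of the patch size")
    return patch_size, image_size, image_size // patch_size


patch_size, image_size, side = patch_grid(MODEL_NAME)
print(
    f"{image_size}x{image_size} input, {patch_size}x{patch_size} patches "
    f"-> {side}x{side} = {side * side} patch tokens"
)
```

Under that assumption, the patch tokens can be arranged as a `side` x `side` spatial grid, which is how dense per-patch features are typically consumed.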