CAR-T2M: XZ-Constrained GPTs
Two variants of a motion generator conditioned on text and a root XZ trajectory. Both use the same backbone architecture as the small unconstrained GPT (4L/512D/8H), plus a 3-layer constraint encoder and cross-attention.
| Variant | Perturbation profile |
|---|---|
| xz_v1/net_last.pth | big (rotate ±90 deg, scale 0.5-2.0, ±1.5 m offset) |
| xz_smallpert_v1/net_last.pth | small (rotate ±20 deg, scale 0.7-1.3, jitter sigma = 0.15 m) |
The "small" profile is the more recent run; the per-keyframe Gaussian jitter is the single most useful signal for breaking the "constraint == GT" trap.
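To make the "small" profile concrete, here is a minimal sketch of how such a perturbation could be applied to a list of waypoints. It is not the repo's synthesizer (that lives in models/constraints_xz.py). The function name `perturb_waypoints`, its parameters, and the choice to rotate about the origin are assumptions made for illustration.

```python
import math
import random


def perturb_waypoints(waypoints, max_rot_deg=20.0, scale_range=(0.7, 1.3),
                      jitter_sigma=0.15, rng=None):
    """Apply one global rotation and scale, then per-keyframe Gaussian jitter.

    waypoints: list of (frame_idx, x, z) tuples, x and z in metres.
    Defaults mirror the "small" profile; all names here are hypothetical.
    """
    rng = rng or random.Random()
    theta = math.radians(rng.uniform(-max_rot_deg, max_rot_deg))
    scale = rng.uniform(*scale_range)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = []
    for frame_idx, x, z in waypoints:
        # Rotate about the origin in the XZ (ground) plane, then scale.
        xr = scale * (cos_t * x - sin_t * z)
        zr = scale * (sin_t * x + cos_t * z)
        # Independent jitter per keyframe, so the constraint no longer
        # coincides exactly with the ground-truth root trajectory.
        xr += rng.gauss(0.0, jitter_sigma)
        zr += rng.gauss(0.0, jitter_sigma)
        out.append((frame_idx, xr, zr))
    return out
```

The rotation and scale are shared by all waypoints of a sample, while the jitter is drawn separately for each keyframe. Frame indices are left unchanged.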
Files
- stats/{Mean,Std,ActiveDims,ConstFill}.npy: dataset stats.
- vqvae/net_best_fid.pth: same frozen VQ-VAE as the unconstrained repo (kept here too so this repo is fully self-contained).
- xz_v1/net_last.pth: big-perturbation GPT (iter 200k).
- xz_smallpert_v1/net_last.pth: small-perturbation GPT.
- */run.log, */latest_iter.txt: training metadata.
Architecture (constraint side)
The XZ constraint is a list of (frame_idx, x, z) waypoints. The constraint
encoder is a 3-layer, 8-head Transformer that produces a [B, N, D] memory,
which the GPT reads through cross-attention.
Trainer, constraint synthesizer, and encoder code: models/constraints_xz.py.
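Because each sample can carry a different number of waypoints, they need a common length N before an encoder can produce a [B, N, D] memory. The sketch below shows one way to pack them into a [B, N, 3] batch with a padding mask. The helper name `pad_waypoints`, the zero padding, and the mask convention (True marks a padded slot) are assumptions; the repo's own batching may differ.

```python
def pad_waypoints(batch, pad_value=0.0):
    """Pack variable-length waypoint lists into a [B, N, 3] nested list.

    batch: list of B constraints, each a list of (frame_idx, x, z).
    Returns (padded, mask) where mask[b][n] is True for padded slots.
    """
    # N is the longest constraint in the batch (0 for an empty batch).
    n_max = max((len(c) for c in batch), default=0)
    padded, mask = [], []
    for constraint in batch:
        rows = [[float(f), float(x), float(z)] for f, x, z in constraint]
        n_pad = n_max - len(rows)
        # Fill the tail with dummy waypoints and flag them in the mask.
        padded.append(rows + [[pad_value] * 3 for _ in range(n_pad)])
        mask.append([False] * len(rows) + [True] * n_pad)
    return padded, mask


padded, mask = pad_waypoints([
    [(0, 1.0, 2.0)],
    [(0, 0.0, 0.0), (30, 1.5, -0.5)],
])
```

A mask of this kind lets attention layers skip the dummy waypoints, so padded slots do not influence the generated motion.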
Loading
from huggingface_hub import snapshot_download
local = snapshot_download("mpilligua/car-t2m-xz-constrained")
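Once the snapshot is on disk, the files listed above can be located relative to the returned directory. The helper below is a hypothetical convenience, not part of the repo; it only builds paths from the file names in the Files list. How the checkpoints are deserialized depends on the model code in the source repo and is not shown here.

```python
from pathlib import Path


def variant_paths(local, variant="xz_smallpert_v1"):
    """Map a snapshot directory to the files listed in this repo.

    local: directory returned by snapshot_download.
    variant: "xz_v1" (big perturbation) or "xz_smallpert_v1" (small).
    """
    root = Path(local)
    stats = {name: root / "stats" / f"{name}.npy"
             for name in ("Mean", "Std", "ActiveDims", "ConstFill")}
    return {
        "gpt": root / variant / "net_last.pth",
        "vqvae": root / "vqvae" / "net_best_fid.pth",
        "stats": stats,
    }
```

For example, `variant_paths(local, "xz_v1")["gpt"]` points at the big-perturbation checkpoint.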
Source code
https://github.com/mpilligua/CAR-T2M (branch refactor).