CAR-T2M: XZ-Constrained GPTs

This repo holds two variants of a motion generator conditioned on text and a root XZ trajectory. Both use the same backbone architecture as the small unconstrained GPT (4L/512D/8H), plus a 3-layer constraint encoder and cross-attention.

Variant                        Perturbation profile
xz_v1/net_last.pth             big (rotate ±90 deg, scale 0.5-2.0, ±1.5 m offset)
xz_smallpert_v1/net_last.pth   small (rotate ±20 deg, scale 0.7-1.3, jitter sigma = 0.15 m)

The "small" profile is the more recent run. Its per-keyframe Gaussian jitter is the single most useful signal for breaking the "constraint == GT" trap, in which the constraint always coincides with the ground-truth (GT) root trajectory.
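To make the two profiles concrete, here is a minimal sketch of how a waypoint list could be perturbed. The numeric ranges are taken from the table above. Everything else is an assumption made for illustration and is not the code in models/constraints_xz.py: the names `BIG`, `SMALL` and `perturb_waypoints`, uniform sampling, rotation about the origin, a zero jitter for the big profile, and a zero offset for the small profile.

```python
import math
import random

# Hypothetical profile dictionaries. The ranges come from the table above;
# the key names, and the zero jitter/offset entries, are assumptions.
BIG = {"max_rot_deg": 90.0, "scale": (0.5, 2.0),
       "max_offset_m": 1.5, "jitter_sigma_m": 0.0}
SMALL = {"max_rot_deg": 20.0, "scale": (0.7, 1.3),
         "max_offset_m": 0.0, "jitter_sigma_m": 0.15}


def perturb_waypoints(waypoints, profile, rng=None):
    """Return a perturbed copy of [(frame_idx, x, z), ...]."""
    rng = rng or random.Random()
    # One global rotation, scale and offset per trajectory (assumed).
    max_rot = profile["max_rot_deg"]
    theta = math.radians(rng.uniform(-max_rot, max_rot))
    scale = rng.uniform(*profile["scale"])
    off = profile["max_offset_m"]
    dx, dz = rng.uniform(-off, off), rng.uniform(-off, off)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    sigma = profile["jitter_sigma_m"]

    out = []
    for frame_idx, x, z in waypoints:
        # Rotate about the origin, then scale and translate.
        xr = scale * (cos_t * x - sin_t * z) + dx
        zr = scale * (sin_t * x + cos_t * z) + dz
        # Independent Gaussian jitter for each keyframe.
        xr += rng.gauss(0.0, sigma)
        zr += rng.gauss(0.0, sigma)
        out.append((frame_idx, xr, zr))
    return out
```

Frame indices pass through unchanged; only the x and z coordinates move. In this sketch the global transform preserves the shape of the path up to rotation and scale, while the per-keyframe jitter is the only term that moves waypoints independently of each other.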

Files

  • stats/{Mean,Std,ActiveDims,ConstFill}.npy: dataset stats.
  • vqvae/net_best_fid.pth: same frozen VQ-VAE as the unconstrained repo (kept here too so this repo is fully self-contained).
  • xz_v1/net_last.pth: big-perturbation GPT (iter 200k).
  • xz_smallpert_v1/net_last.pth: small-perturbation GPT.
  • */run.log, */latest_iter.txt: training metadata.

Architecture (constraint side)

The XZ constraint is a list of (frame_idx, x, z) waypoints. The constraint encoder is a 3-layer, 8-head Transformer that produces a [B, N, D] memory, which the GPT reads through cross-attention.
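Since each sample may carry a different number of waypoints, a batch has to be brought to a common length N before it can become a [B, N, D] memory. The sketch below shows one way to do that with padding and a validity mask. The function name `pad_constraints`, the zero padding value, and the sort by frame index are assumptions for illustration; the actual batching in models/constraints_xz.py may differ.

```python
from typing import List, Tuple

Waypoint = Tuple[int, float, float]  # (frame_idx, x, z)


def pad_constraints(batch: List[List[Waypoint]], pad_value: float = 0.0):
    """Pad variable-length waypoint lists to a common length N.

    Returns (padded, mask). `padded` is a nested list of shape [B, N, 3];
    mask[b][n] is True where a real waypoint is present.
    """
    n_max = max((len(wps) for wps in batch), default=0)
    padded, mask = [], []
    for wps in batch:
        # Sort by frame index so waypoints appear in temporal order.
        rows = [[float(f), float(x), float(z)] for f, x, z in sorted(wps)]
        n_pad = n_max - len(rows)
        padded.append(rows + [[pad_value] * 3 for _ in range(n_pad)])
        mask.append([True] * len(rows) + [False] * n_pad)
    return padded, mask
```

Note that PyTorch attention modules take the opposite convention for boolean padding masks (True marks positions to ignore), so a mask like this one would need to be inverted before being passed as a key padding mask.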

The trainer, constraint synthesizer, and encoder are implemented in models/constraints_xz.py.

Loading

from huggingface_hub import snapshot_download

# Download the whole repo (or reuse the local cache) and return its local directory path.
local = snapshot_download("mpilligua/car-t2m-xz-constrained")
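Once the snapshot is on disk, the individual files can be located from the layout listed under Files. The helper below is a small sketch of that lookup; `VARIANTS`, `resolve_files`, and the dictionary keys are hypothetical names introduced here, not part of the repo.

```python
from pathlib import Path

# Relative checkpoint paths as listed under "Files".
# The "big"/"small" keys are made up for this sketch.
VARIANTS = {
    "big": "xz_v1/net_last.pth",
    "small": "xz_smallpert_v1/net_last.pth",
}


def resolve_files(local_dir, variant="small"):
    """Map logical names to files inside a downloaded snapshot."""
    root = Path(local_dir)
    files = {
        "gpt": root / VARIANTS[variant],
        "vqvae": root / "vqvae" / "net_best_fid.pth",
        "mean": root / "stats" / "Mean.npy",
        "std": root / "stats" / "Std.npy",
    }
    # Fail early with a clear message if the snapshot is incomplete.
    missing = [str(p) for p in files.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing files in snapshot: {missing}")
    return files
```

For example, `resolve_files(local, "small")["gpt"]` gives the path of the small-perturbation checkpoint, which can then be passed to `torch.load(..., map_location="cpu")`. The internal layout of the checkpoint is not documented here, so inspect the loaded object before assuming any particular structure.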

Source code

https://github.com/mpilligua/CAR-T2M (refactor branch).
