Action Chunking Transformer
This is a basic model for solving simple imitation learning tasks. The original implementations can be found here.
The model takes images from one or more cameras, together with the robot state, and produces a chunk of actions that the robot can execute as a sequence of movements in the real world.
The model weights are random and provided only for testing purposes.
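To illustrate what executing a chunk as a sequence of movements can look like, here is a minimal sketch that uses only NumPy. It assumes the chunk is laid out as (batch, chunk_len, action_dim), which this card does not confirm, and the `execute_chunk` helper and `send_action` callback are hypothetical names standing in for whatever robot interface you use.

```python
import numpy as np


def execute_chunk(chunk, send_action, max_steps=None):
    """Send each action of a chunk to the robot, one step at a time.

    `send_action` is a hypothetical callback for your robot interface.
    Returns the number of actions that were sent.
    """
    actions = np.asarray(chunk, dtype=np.float32)
    # Assumed layout: (batch, chunk_len, action_dim). Drop a leading batch of 1.
    if actions.ndim == 3:
        if actions.shape[0] != 1:
            raise ValueError("expected a batch size of 1")
        actions = actions[0]
    if actions.ndim != 2:
        raise ValueError(f"expected a 2-D chunk, got shape {actions.shape}")
    # Optionally execute only the first few steps before re-planning.
    steps = actions if max_steps is None else actions[:max_steps]
    for action in steps:
        send_action(action)
    return len(steps)


# Stand-in for a real chunk: 10 steps of a 7-dim action (both sizes are assumed).
dummy_chunk = np.zeros((1, 10, 7), dtype=np.float32)
sent = []  # collects the actions instead of driving a robot
n_executed = execute_chunk(dummy_chunk, sent.append, max_steps=5)
```

Executing only part of a chunk and then predicting again is a common way to trade smoothness against reactivity. Whether it suits your setup depends on the control rate and the task.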
How to Use
Installation
uv pip install physicalai numpy
Running Inference
The following example shows the inference API for this model:
import numpy as np
from physicalai.inference import InferenceModel
model = InferenceModel("act-fp16-ov", device="CPU")
# Build a dummy LIBERO-style observation.
# LIBERO provides two cameras (agentview + wrist) and an 8-dim robot state.
# Images use the LeRobot convention: float32 in [0, 1], shape (C, H, W).
observation = {
    "images.image": np.random.rand(1, 3, 256, 256).astype(np.float32),
    "images.image2": np.random.rand(1, 3, 256, 256).astype(np.float32),
    "state": np.zeros((1, 8), dtype=np.float32),
}
chunk = model.predict_action_chunk(observation)
Note that the model should be downloaded and saved to the act-fp16-ov folder prior to running this script.
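Real camera frames usually arrive as uint8 arrays of shape (H, W, 3), so they need converting to the float32, [0, 1], (C, H, W) convention used in the observation above. The following is a minimal sketch of that conversion using only NumPy. The `to_model_image` helper is a hypothetical name, the frames are assumed to be already 256x256, and the mapping of the agentview and wrist cameras to the two image keys is an assumption.

```python
import numpy as np


def to_model_image(frame):
    """Convert an (H, W, 3) uint8 frame to a (1, 3, H, W) float32 array in [0, 1]."""
    frame = np.asarray(frame)
    if frame.ndim != 3 or frame.shape[2] != 3:
        raise ValueError(f"expected shape (H, W, 3), got {frame.shape}")
    if frame.dtype != np.uint8:
        raise ValueError(f"expected uint8 pixels, got {frame.dtype}")
    # Channels first, then scale 0..255 to 0..1.
    chw = np.transpose(frame, (2, 0, 1)).astype(np.float32) / 255.0
    # Add the batch dimension and make the memory layout contiguous.
    return np.ascontiguousarray(chw[np.newaxis])


# Stand-ins for real camera frames (assumed to be 256x256 already).
agentview = np.full((256, 256, 3), 255, dtype=np.uint8)
wrist = np.zeros((256, 256, 3), dtype=np.uint8)

observation = {
    "images.image": to_model_image(agentview),  # key-to-camera mapping is assumed
    "images.image2": to_model_image(wrist),
    "state": np.zeros((1, 8), dtype=np.float32),
}
```

The helper does not resize, so frames of a different resolution would need resizing with an image library of your choice first.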
Legal information
The original model is distributed under the Apache 2.0 license.
Disclaimer
Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.