Representation over Routing: Diagnosing Temporal Routing Pathologies in Multi-Timescale PPO
This model repository hosts pretrained PyTorch actor weights for the diagnostic study "Representation over Routing: Diagnosing Temporal Routing Pathologies in Multi-Timescale PPO".
The weights correspond to controlled PPO experiments on LunarLander-v2. They are provided to reproduce the qualitative behaviors discussed in the paper: a single-horizon baseline, differentiable temporal routing, error-based temporal routing, and Target Decoupling.
This model repository is a weight distribution package. Training scripts and selected generated figures live in the GitHub code repository; paper text and source files are distributed through arXiv.
Related Links
- Paper: https://arxiv.org/abs/2604.13517
- Interactive Demo Space: https://huggingface.co/spaces/ben-dlwlrma/Representation-Over-Routing-Demo
- GitHub Repository: https://github.com/ben-dlwlrma/Representation-Over-Routing
Model Weights Overview
The repository provides four standalone .pth actor weight files:
- 1_baseline.pth (Baseline PPO): single-horizon PPO reference policy.
- 2_surrogate_hacking_attention.pth (Differentiable Routing): policy from the actor-side attention routing diagnostic.
- 3_temporal_paradox_variance.pth (Error-Based Routing): policy from the gradient-free error-based routing diagnostic.
- 4_target_decoupling_final.pth (Target Decoupling): policy trained with structural separation between the actor objective and temporal routing. The actor uses the long-horizon advantage, while auxiliary critic heads remain as regularizers during training.
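When scripting comparisons across the four variants, the list above can be captured as a small lookup table. The sketch below is illustrative only: the names `WEIGHT_FILES`, `REPO_ID`, and `download_all` are assumptions introduced here and are not part of the repository. Only the file names, the repo id, and `hf_hub_download` come from this card.

```python
# Hypothetical lookup table; file names and labels are copied from the list above.
WEIGHT_FILES = {
    "1_baseline.pth": "Baseline PPO",
    "2_surrogate_hacking_attention.pth": "Differentiable Routing",
    "3_temporal_paradox_variance.pth": "Error-Based Routing",
    "4_target_decoupling_final.pth": "Target Decoupling",
}

REPO_ID = "ben-dlwlrma/Representation-Over-Routing"


def download_all(repo_id=REPO_ID):
    """Download every weight file and return {label: local_path}.

    Hypothetical helper; needs network access and the huggingface_hub package.
    """
    # Imported lazily so the table above is usable without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return {
        label: hf_hub_download(repo_id=repo_id, filename=filename)
        for filename, label in WEIGHT_FILES.items()
    }
```

Calling `download_all()` fetches all four files into the local Hub cache, so repeated calls should not download them again.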
Target Decoupling is described in the paper as a structural isolation principle in the LunarLander-v2 PPO setting. The reported evidence concerns removal of the actor-side routing pathway and improved observed worst-seed return in the tested run set, not broad benchmark superiority.
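Worst-seed return, as used above, is the lowest per-seed score across a set of training seeds. A minimal sketch of that aggregation follows, assuming each seed is summarized by the mean of its evaluation episode returns. The paper's exact evaluation protocol may differ, and the function name and the numbers in the usage lines are placeholders, not reported results.

```python
from statistics import mean


def worst_seed_return(returns_by_seed):
    """Return (seed, score) for the seed with the lowest mean episode return.

    `returns_by_seed` maps a seed to a non-empty list of episode returns.
    Summarizing a seed by its mean is an assumption of this sketch.
    """
    if not returns_by_seed:
        raise ValueError("returns_by_seed must not be empty")
    per_seed = {seed: mean(returns) for seed, returns in returns_by_seed.items()}
    worst = min(per_seed, key=per_seed.get)
    return worst, per_seed[worst]


# Placeholder numbers for illustration only; not results from the paper.
example = {0: [200.0, 220.0], 1: [-50.0, 10.0], 2: [150.0, 170.0]}
print(worst_seed_return(example))
```

Reporting the minimum alongside the mean makes a single collapsed seed visible, which an average over seeds can hide.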
Usage
Training scripts and selected diagnostic plots are available in the GitHub repository. The manuscript is distributed through arXiv and is not duplicated as source files in the code or model repositories.
The published weights contain actor parameters and can be loaded into the same MLP actor architecture used by the training scripts:
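import numpy as np
import torch
import torch.nn as nn
import gymnasium as gym
from huggingface_hub import hf_hub_download

# Download the Target Decoupling actor weights from the Hub.
weight_path = hf_hub_download(
    repo_id="ben-dlwlrma/Representation-Over-Routing",
    filename="4_target_decoupling_final.pth",
)

def layer_init(layer, std=np.sqrt(2), bias_const=0.0):
    # Orthogonal weights and constant biases; overwritten by load_state_dict below.
    nn.init.orthogonal_(layer.weight, std)
    nn.init.constant_(layer.bias, bias_const)
    return layer

# MLP actor: 8-dimensional observation -> logits over 4 discrete actions.
actor = nn.Sequential(
    layer_init(nn.Linear(8, 64)),
    nn.Tanh(),
    layer_init(nn.Linear(64, 64)),
    nn.Tanh(),
    layer_init(nn.Linear(64, 4), std=0.01),
)
# map_location="cpu" lets the weights load on machines without a GPU.
actor.load_state_dict(torch.load(weight_path, map_location="cpu", weights_only=True))
actor.eval()

# LunarLander needs the Box2D extra (pip install "gymnasium[box2d]").
env = gym.make("LunarLander-v2")
state, _ = env.reset()
done = False
while not done:
    # Add a batch dimension: shape (8,) -> (1, 8).
    state_tensor = torch.as_tensor(state, dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():
        logits = actor(state_tensor)
    # Greedy action: index of the largest logit.
    action = torch.argmax(logits, dim=1).item()
    state, reward, terminated, truncated, _ = env.step(action)
    done = terminated or truncated
env.close()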
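The loop above picks the highest-logit action at every step, which gives a deterministic rollout. PPO policies for discrete actions are typically trained by sampling from the softmax distribution over the logits, so a sampled rollout can behave differently from the greedy one. The following dependency-free sketch contrasts the two selection rules on a plain list of logits. The function names and the example logits are hypothetical, and whether the training scripts sample in exactly this way is an assumption.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_action(logits):
    """Index of the largest logit (the rule used by the loop above)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sampled_action(logits, rng=random):
    """Draw an action index with probability given by softmax(logits)."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [0.1, 2.0, -1.0, 0.5]  # hypothetical logits for the 4 LunarLander actions
print(greedy_action(logits), sampled_action(logits, random.Random(0)))
```

In the PyTorch loop, the sampled variant could be written as `torch.distributions.Categorical(logits=logits).sample().item()` in place of the `torch.argmax` line.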
The paper experiments were conducted on LunarLander-v2. The hosted demo may use LunarLander-v3 for compatibility with current Gymnasium releases while preserving the same actor architecture and weight format.
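Because the available LunarLander version depends on the installed Gymnasium release, a script can try the environment ids in order of preference and use the first one that constructs. A minimal sketch, assuming a hypothetical helper `make_first_available` that receives the factory (for example `gym.make`) as an argument. It catches any exception for simplicity, which is broader than strictly necessary.

```python
def make_first_available(env_ids, make_fn):
    """Return (env_id, env) for the first id that make_fn can construct.

    Hypothetical helper: `make_fn` is any callable such as gymnasium.make.
    """
    errors = {}
    for env_id in env_ids:
        try:
            return env_id, make_fn(env_id)
        except Exception as exc:  # broad on purpose; narrow it in real code
            errors[env_id] = exc
    raise RuntimeError(f"no environment could be created: {errors}")
```

With Gymnasium installed, `make_first_available(["LunarLander-v2", "LunarLander-v3"], gym.make)` would prefer the version used in the paper and fall back to v3 when v2 is unavailable. Returning the id alongside the environment lets a script record which version a rollout actually ran on.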
Citation
@misc{sunRepresentationRoutingDiagnosing2026,
  title = {Representation over {{Routing}}: {{Diagnosing Temporal Routing Pathologies}} in {{Multi-Timescale PPO}}},
  shorttitle = {Representation over {{Routing}}},
  author = {Sun, Jing},
  year = 2026,
  publisher = {arXiv},
  doi = {10.48550/ARXIV.2604.13517},
  urldate = {2026-04-16},
  copyright = {Creative Commons Attribution 4.0 International},
  keywords = {Artificial Intelligence (cs.AI),FOS: Computer and information sciences,Machine Learning (cs.LG)}
}