SOD (Step-wise On-policy Distillation) model family for small language model agents.
Young Zhong
youngzhong
Recent Activity
youngzhong/SOD-GRPO_teacher-4B: model updated about 11 hours ago
youngzhong/SOD-1.7B: model updated about 11 hours ago
youngzhong/SOD-0.6B: model updated about 11 hours ago