Chintu Kumar

chang2394

AI & ML interests

None yet

Organizations

None yet

Collections 7

Off policy/entropy

Efficient RL Training for LLMs with Experience Replay

Paper • 2604.08706 • Published about 1 month ago • 21

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 231

models 0

None public yet

datasets 0

None public yet
