
Xiaoyang Cao

Sean13
·
https://xiaoyangcao1113.github.io/
  • XiaoyangCao1113
  • xiaoyangcao

AI & ML interests

RLHF, Deep Reinforcement Learning

Recent Activity

updated a model 1 day ago
Sean13/responsibility-decomposition
published a model 2 days ago
Sean13/responsibility-decomposition
updated a model 19 days ago
Sean13/grpo_nocurriculum_Qwen3-1.7B-100step

Organizations

None yet

upvoted a paper 6 months ago

Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models

Paper • 2511.08577 • Published Nov 11, 2025 • 110
upvoted 3 papers 8 months ago

RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training

Paper • 2510.06710 • Published Oct 8, 2025 • 43

Cache-to-Cache: Direct Semantic Communication Between Large Language Models

Paper • 2510.03215 • Published Oct 3, 2025 • 99

Latent Collective Preference Optimization: A General Framework for Robust LLM Alignment

Paper • 2509.24159 • Published Sep 29, 2025 • 1
upvoted a paper 12 months ago

VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments

Paper • 2506.02387 • Published Jun 3, 2025 • 58