- Reading Between the Prompts: How Stereotypes Shape LLM's Implicit Personalization
  Paper • 2505.16467 • Published
- IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data
  Paper • 2506.02449 • Published • 1
- Localizing Persona Representations in LLMs
  Paper • 2505.24539 • Published
- Persona Vectors: Monitoring and Controlling Character Traits in Language Models
  Paper • 2507.21509 • Published • 34
Collections
Collections including paper arxiv:2604.01202
- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 326
- Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process
  Paper • 2512.23988 • Published • 19
- SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time
  Paper • 2512.25075 • Published • 16
- Guiding a Diffusion Transformer with the Internal Dynamics of Itself
  Paper • 2512.24176 • Published • 8

- Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains
  Paper • 2402.05140 • Published • 23
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 21
- QLoRA: Efficient Finetuning of Quantized LLMs
  Paper • 2305.14314 • Published • 61
- OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
  Paper • 2402.14658 • Published • 84

- VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
  Paper • 2602.10693 • Published • 220
- Flash-KMeans: Fast and Memory-Efficient Exact K-Means
  Paper • 2603.09229 • Published • 82
- DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use
  Paper • 2603.11076 • Published • 5
- LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning
  Paper • 2603.21065 • Published • 77

- Magistral
  Paper • 2506.10910 • Published • 68
- Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute
  Paper • 2506.15882 • Published • 2
- MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization
  Paper • 2507.14683 • Published • 136
- The Invisible Leash: Why RLVR May Not Escape Its Origin
  Paper • 2507.14843 • Published • 85