tanzelin430/The-Scaling-Law-for-Reinforcement-Learning
by tanzelin430 · RLHF & Preference · updated 1mo ago
[ACL 2026] Code repo for the paper "Scaling Behaviors of LLM Reinforcement Learning Post-Training"
38 momentum · 22 stars · 5 forks · rank #95