mbzuai-oryx/Awesome-LLM-Post-training
by mbzuai-oryx · RLHF & Preference · updated 2mo ago
Awesome Reasoning LLM Tutorial/Survey/Guide
58 momentum · 2,441 stars · 164 forks · rank #41
fine · large-language-models · post-training · reasoning · reinforcement-learning · scaling