mbzuai-oryx/Awesome-LLM-Post-training

by mbzuai-oryx · RLHF & Preference · updated 2mo ago

Awesome Reasoning LLM Tutorial/Survey/Guide

Momentum: 58 · Stars: 2,441 · Forks: 164 · Rank: #41
Tags: fine · large-language-models · post-training · reasoning · reinforcement-learning · scaling