zht8506/Easy-LLM-Post-Training
by zht8506 · RLHF & Preference · updated 2d ago
Implements popular LLM post-training algorithms (SFT, DFT, DPO, GRPO, etc.) in PyTorch with simple, easy-to-follow code.
momentum 55 · stars 117 · forks 10 · rank #48