zht8506/Easy-LLM-Post-Training

by zht8506 · RLHF & Preference · updated 2d ago

Implements popular LLM post-training algorithms (SFT, DFT, DPO, GRPO, etc.) in PyTorch with simple, easy-to-read code.

Momentum: 55 · Stars: 117 · Forks: 10 · Rank: #48