
mst272/LLM-Dojo

by mst272 · Fine-Tuning Tools · updated 3mo ago

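A lightweight LLM post-training framework supporting SFT, RLVR, On-Policy KD, Guide KD, and hybrid training; implements single-turn/multi-turn Guide distillation, multi-teacher distillation, Reward hybrid training, and automated data routing 👩‍🎓👨‍🎓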

48 momentum · 939 stars · 86 forks · rank #59