mst272/LLM-Dojo
by mst272 · Fine-Tuning Tools · updated 3mo ago
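A lightweight LLM post-training framework supporting SFT, RLVR, On-Policy KD, Guide KD, and hybrid training; implements single-turn and multi-turn Guide distillation, multi-teacher distillation, Reward hybrid training, and automated data routing 👩🎓👨🎓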
48 momentum · 939 stars · 86 forks · rank #59