modelscope/ms-swift
by modelscope · RLHF & Preference · updated today
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...) (AAAI 2025).
momentum 79 · stars 14,494 · forks 1,475 · rank #4
deepseek-r1 · embedding · grpo · internvl · liger · llama · llama4 · llm · lora · megatron · moe · multimodal