modelscope/ms-swift

by modelscope · RLHF & Preference · updated today

Use PEFT or full-parameter training to run CPT, SFT, DPO, or GRPO on 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...). (AAAI 2025)

14,494 stars · 1,475 forks

deepseek-r1 · embedding · grpo · internvl · liger · llama · llama4 · llm · lora · megatron · moe · multimodal