shibing624/MedicalGPT

by shibing624 · RLHF & Preference · updated 10d ago

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, with implementations of incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.

5,511 stars · 763 forks · rank #12

chatgpt · dpo · gpt · llama · llm · medical · medicalgpt
