shibing624/MedicalGPT
by shibing624 · RLHF & Preference · updated 10d ago
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, with support for incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.
73 momentum · 5,511 stars · 763 forks · rank #12
chatgpt · dpo · gpt · llama · llm · medical · medicalgpt