NiuTrans/Vision-LLM-Alignment
by NiuTrans · RLHF & Preference · updated 12mo ago
This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vision models.
24 momentum · 122 stars · 9 forks · rank #147
alignment · dpo · llama3-vision · llava · llm · mllm · multi-model · ppo · reward · rlhf · sft · vision