NiuTrans/Vision-LLM-Alignment

by NiuTrans · RLHF & Preference · updated 12mo ago

This repository contains code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA and LLaMA-3.2-vision models.

122 stars · rank #147

Tags: alignment, dpo, llama3-vision, llava, llm, mllm, multi-model, ppo, reward, rlhf, sft, vision
