argilla-io/notus

by argilla-io · RLHF & Preference · updated 2y ago

Notus is a collection of LLMs fine-tuned with SFT, DPO, SFT+DPO, and/or other RLHF techniques, always following a data-first approach.

168 stars · rank #139

alignment-handbook · dpo · fine-tuning · lm-alignment · preference-data · trl · zephyr
