
argilla-io/notus

by argilla-io · RLHF & Preference · updated 2y ago

Notus is a collection of LLMs fine-tuned with SFT, DPO, SFT+DPO, and/or other RLHF techniques, always following a data-first approach.

momentum 25 · stars 168 · forks 14 · rank #139
alignment-handbook · dpo · fine-tuning · lm-alignment · preference-data · trl · zephyr