argilla-io/notus
by argilla-io · RLHF & Preference · updated 2y ago
Notus is a collection of LLMs fine-tuned with SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach.
25 momentum · 168 stars · 14 forks · rank #139
alignment-handbook · dpo · fine-tuning · lm-alignment · preference-data · trl · zephyr