jasonvanf/llama-trl

by jasonvanf · RLHF & Preference · updated 9mo ago

LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA

Momentum: 27
Stars: 239
Forks: 24
Rank: #130
Tags: adapter, chatgpt, gpt, gpt-4, llama, lora, peft, ppo, rlhf, transformer, trl