jasonvanf/llama-trl
by jasonvanf · RLHF & Preference · updated 9mo ago
LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
Momentum: 27 · Stars: 239 · Forks: 24 · Rank: #130
Tags: adapter · chatgpt · gpt · gpt-4 · llama · lora · peft · ppo · rlhf · transformer · trl