FINE‑TUNING/INDEX

jasonvanf/llama-trl

by jasonvanf · RLHF & Preference · updated 9mo ago

LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA

Momentum: 27 · Stars: 239 · Forks: 24 · Rank: #130

Tags: adapter · chatgpt · gpt · gpt-4 · llama · lora · peft · ppo · rlhf · transformer · trl
