FINE‑TUNING/INDEX

RLHFlow/Reinforce-Ada

by RLHFlow · Training Frameworks · updated 6mo ago

An adaptive sampling framework for Reinforce-style LLM post-training.

Momentum: 22 · Stars: 96 · Forks: 17 · Rank: #152
