wxhcore/bumblecore
by wxhcore · RLHF & Preference · updated 1mo ago
An LLM training framework built from the ground up, featuring a custom BumbleBee architecture and end-to-end support for multiple open-source models across Pretraining → SFT → RLHF/DPO.
momentum 46 · stars 98 · forks 13 · rank #63
ai · deepseek · fine-tuning · generative-ai · gpt · instruction-tuning · llm · lora · nlp · peft · pretrain · qwen