Reward Model Scoring while SFTuning #3260

rkchee · 2023-05-30T03:40:19Z

Able to run the reward model during supervised fine-tuning (SFT)
Able to choose the sampling parameters: Nucleus9, K50, Greedy
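
For reviewers, here is a rough sketch of how the sampling choices above could map to generation settings and feed a reward score. It is not the code in this PR. The preset values are assumptions read off the names (Nucleus9 as nucleus sampling with `top_p=0.9`, K50 as top-k sampling with `top_k=50`); the keyword names follow Hugging Face `generate()`; `SAMPLING_PRESETS`, `generation_kwargs`, `mean_reward`, and `score_fn` are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical preset table. The preset names come from the PR description;
# the values are assumptions inferred from those names.
SAMPLING_PRESETS: Dict[str, Dict[str, object]] = {
    "Nucleus9": {"do_sample": True, "top_p": 0.9},  # assumed: nucleus sampling, p = 0.9
    "K50": {"do_sample": True, "top_k": 50},        # assumed: top-k sampling, k = 50
    "Greedy": {"do_sample": False},                 # greedy decoding, no sampling
}


def generation_kwargs(preset: str) -> Dict[str, object]:
    """Return a copy of the generation settings for a named preset."""
    if preset not in SAMPLING_PRESETS:
        raise KeyError(f"unknown sampling preset: {preset!r}")
    return dict(SAMPLING_PRESETS[preset])


def mean_reward(replies: List[str], score_fn: Callable[[str], float]) -> float:
    """Average the reward score over a batch of generated replies.

    score_fn stands in for the reward model: any callable that maps a
    reply string to a scalar score.
    """
    if not replies:
        raise ValueError("need at least one reply to score")
    return sum(score_fn(reply) for reply in replies) / len(replies)
```

In an evaluation step, the kwargs for the chosen preset would be passed to the SFT model's generation call, and the decoded replies would be passed to `mean_reward` together with the real reward model's scoring function.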

rkchee added 4 commits May 29, 2023 13:40

getting to work

6c6d223

first PR to the group

48d7c84

fixed a comma by the tokenizer

c9358eb

pre-commit completed

fb31775

rkchee requested review from theblackcat102, sanagno, dvruette, andreaskoepf and yk as code owners May 30, 2023 03:40

andreaskoepf added the ml label May 30, 2023

rkchee mentioned this pull request Jun 6, 2023

Add reward model scoring during SFT evaluation #2705

Open
