Open-Assistant/model/model_training/utils/ppo_utils.py
Lines 574 to 582 in 70f30a6
Request to add **kwargs to the CustomPromptPipeline init method to stay compatible across trlx library versions.
The following error occurred while running RLHF with trainer_rl.py.
Analyzing the cause, I found that a parameter called add_special_tokens was added to the get_pipeline function in trlx.py on the main branch of the trlx library:
https://github.com/CarperAI/trlx/blob/355c9741f2e606de796f5c6f9b682f7dd00f97c5/trlx/trlx.py#L122-L125C6
However, this parameter does not yet exist in the CustomPromptPipeline class on the main branch of OA.
The add_special_tokens parameter is present in the get_pipeline function of trlx.py on trlx's main branch, but it does not exist in the v0.6.0 release.
So I think we need to accept **kwargs to prevent errors across different versions.
The add_special_tokens parameter is True only when the model architecture is seq2seq, and previous OA versions always set it to False, so absorbing it via **kwargs seems to be okay for now.
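A minimal sketch of the proposed change (the class name comes from ppo_utils.py; the other parameter names and the base class are simplified stand-ins for illustration). Accepting **kwargs means newer trlx versions can pass add_special_tokens without raising a TypeError, while older versions that do not send it keep working unchanged:

```python
class CustomPromptPipeline:
    """Simplified stand-in for the pipeline class in ppo_utils.py."""

    def __init__(self, prompts, max_prompt_length, tokenizer, **kwargs):
        self.prompts = prompts
        self.max_prompt_length = max_prompt_length
        self.tokenizer = tokenizer
        # Extra keyword arguments (e.g. add_special_tokens, passed by
        # trlx main but absent in v0.6.0) are absorbed here instead of
        # crashing the constructor.
        self.extra_kwargs = kwargs


# trlx v0.6.0 style call: no add_special_tokens
old_style = CustomPromptPipeline(["a prompt"], 128, tokenizer=None)

# trlx main style call: add_special_tokens is forwarded (False for
# non-seq2seq models, as in previous OA behavior)
new_style = CustomPromptPipeline(
    ["a prompt"], 128, tokenizer=None, add_special_tokens=False
)
print(new_style.extra_kwargs)  # → {'add_special_tokens': False}
```

Since add_special_tokens would only ever arrive as False for the causal-LM models OA trains, simply absorbing it should not change tokenization behavior.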