This repository has been archived by the owner on Dec 11, 2022. It is now read-only.

MultiDiscrete Gym Environments #176

Open
bbalaji-ucsd opened this issue Dec 28, 2018 · 8 comments
Labels
priority/p3 enhancements not currently in focus or low impact bugs

Comments

@bbalaji-ucsd

Gym API supports MultiDiscrete action spaces:
https://github.com/openai/gym/blob/master/gym/spaces/multi_discrete.py

This is useful when you want to discretize a continuous control problem, a technique common in literature: https://arxiv.org/abs/1808.00177

But MultiDiscrete action spaces are ignored in Coach:
https://github.com/NervanaSystems/coach/blob/master/rl_coach/environments/gym_environment.py#L367

Can you please add support for them?

As a concrete use case, I would like to (independently) discretize the steering and throttle actions in DeepRacer:
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/reinforcement_learning/rl_deepracer_robomaker_coach_gazebo/src/robomaker/environments/deepracer_env.py#L320

@zach-nervana
Contributor

I agree it makes sense to add support for MultiDiscrete action spaces in Coach. In cases where an environment already has a continuous action space defined, there are a couple of approaches:

  1. Create a gym wrapper which converts continuous action spaces into discrete ones.
  2. Provide a way to specify in coach that a continuous action space should be discretized.
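
As a rough sketch of what option 1 involves (this is not Coach or gym code; `make_bins` and `DiscretizedActions` are hypothetical names, and the steering/throttle ranges below are made-up values), each continuous dimension gets its own evenly spaced grid, and a vector of bin indices is mapped back to continuous values. In a real wrapper this mapping would presumably live in the `action` method of a `gym.ActionWrapper` subclass.

```python
def make_bins(low, high, num_bins):
    """Return num_bins evenly spaced values covering [low, high]."""
    if num_bins < 2:
        raise ValueError("need at least two bins per dimension")
    step = (high - low) / (num_bins - 1)
    return [low + i * step for i in range(num_bins)]


class DiscretizedActions:
    """Hypothetical helper: independent per-dimension discretization."""

    def __init__(self, lows, highs, bins_per_dim):
        if not (len(lows) == len(highs) == len(bins_per_dim)):
            raise ValueError("lows, highs and bins_per_dim must have equal length")
        # One grid of continuous values per action dimension.
        self.grids = [make_bins(lo, hi, n)
                      for lo, hi, n in zip(lows, highs, bins_per_dim)]
        # Plays the role of MultiDiscrete's nvec.
        self.nvec = list(bins_per_dim)

    def to_continuous(self, indices):
        """Map one bin index per dimension to a continuous action vector."""
        if len(indices) != len(self.grids):
            raise ValueError("expected one index per action dimension")
        values = []
        for grid, i in zip(self.grids, indices):
            if not 0 <= i < len(grid):
                raise ValueError("bin index out of range")
            values.append(grid[i])
        return values


# Assumed ranges for illustration only: steering in [-1, 1], throttle in [0, 1].
actions = DiscretizedActions(lows=[-1.0, 0.0], highs=[1.0, 1.0], bins_per_dim=[5, 3])
print(actions.to_continuous([2, 1]))
```

The agent then only ever sees the index vector, so the environment's original continuous space stays untouched.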
@bbalaji-ucsd
Author

Both options look good to me.

I'm concerned about how to map the multiple sets of discrete actions to the neural network outputs. If this gets supported cleanly, I don't mind manually discretizing the continuous action spaces.
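
One possible mapping, sketched here under the assumption that the action dimensions are treated as independent (a factorized policy), is to have the network emit `sum(nvec)` logits and slice them into one softmax head per dimension. This is only an illustration, not how Coach does it; `split_logits`, `greedy_action` and `log_prob` are hypothetical helper names.

```python
import math


def split_logits(logits, nvec):
    """Slice a flat logit vector into one segment per action dimension."""
    if len(logits) != sum(nvec):
        raise ValueError("expected sum(nvec) logits")
    heads, start = [], 0
    for n in nvec:
        heads.append(logits[start:start + n])
        start += n
    return heads


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_action(logits, nvec):
    """Pick the highest-scoring choice independently in every head."""
    return [max(range(len(head)), key=head.__getitem__)
            for head in split_logits(logits, nvec)]


def log_prob(logits, nvec, action):
    """Joint log-probability: sum of per-head log-probabilities."""
    return sum(math.log(softmax(head)[a])
               for head, a in zip(split_logits(logits, nvec), action))


# Two decisions: the first with 2 choices, the second with 3.
print(greedy_action([0.1, 2.0, 0.3, 0.2, 5.0], [2, 3]))
```

With this layout the output layer grows with the sum of the per-dimension sizes rather than their product, at the cost of ignoring correlations between dimensions.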

@zach-nervana
Contributor

I agree that this points to another change. The observation space can already be defined as a dictionary of multiple spaces. It would be nice to have something similar for action spaces. As it stands, gym environments in Coach only support BoxActionSpace and DiscreteActionSpace. It looks like Coach also supports MultiSelectActionSpace and PartialDiscreteActionSpaceMap. These look like they might support some special cases, but not the general case that is available for observation spaces.

@bbalaji-ucsd
Author

Yes, completely agree.

@scttl added this to To do in Coach Dev on Jan 10, 2019
@galnov
Member

galnov commented Jan 15, 2019

@bbalaji-ucsd check out the BoxDiscretization Action Filter. There's also a sample CARLA preset using it. Is this good enough for your purpose?

@zach-nervana
Contributor

@galnov very cool, I didn't know about this, and it does address the case I raised where an environment has a continuous action space that you want discretized. However, the original issue still stands, which is that gym environments can define MultiDiscrete action spaces that Coach doesn't recognize.

@zach-nervana
Contributor

@galnov in gym, MultiDiscrete is an action space defined by a vector nvec. If the vector has value [2, 3, 4], then the action space consists of 3 decisions. The first decision has two possibilities, the second decision has 3, et cetera.
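
To make the `[2, 3, 4]` example concrete, here is a small pure-Python sketch (it does not use gym itself; `flatten` and `unflatten` are hypothetical helpers) that enumerates the joint actions and shows one way such a space could be exposed as a single flat discrete space, assuming row-major ordering:

```python
from itertools import product

nvec = [2, 3, 4]

# Every joint action is one choice per decision: 2 * 3 * 4 = 24 combinations.
joint_actions = list(product(*(range(n) for n in nvec)))


def flatten(action, nvec):
    """Map a per-decision action tuple to a single index (row-major)."""
    index = 0
    for a, n in zip(action, nvec):
        if not 0 <= a < n:
            raise ValueError("choice out of range for its decision")
        index = index * n + a
    return index


def unflatten(index, nvec):
    """Inverse of flatten: recover the per-decision choices."""
    action = []
    for n in reversed(nvec):
        index, a = divmod(index, n)
        action.append(a)
    return tuple(reversed(action))


print(len(joint_actions))
print(flatten((1, 2, 3), nvec))
print(unflatten(23, nvec))
```

Flattening lets an agent that only understands a single discrete space handle the environment, but the number of outputs grows as the product of the entries in `nvec`, which is why a dedicated multi-discrete space is attractive for larger problems.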

We could support this in Coach with CompoundActionSpace in combination with DiscreteActionSpace; however, that would also require supporting CompoundActionSpace under all agents. Do you think that is preferable to creating a new MultiDiscreteActionSpace?

@galnov added the priority/p3 (enhancements not currently in focus or low impact bugs) label on Jan 16, 2019
@scttl moved this from Requires Grooming to Groomed but Not Started in Coach Dev on Jan 17, 2019
@mattiasmar

mattiasmar commented Mar 3, 2019

@galnov Is the acceptance criterion for this issue an agent that implements multi-dimensional discrete RL?
Would the paper by Dulac-Arnold et al. be implemented?

4 participants