This repository has been archived by the owner on Dec 11, 2022. It is now read-only.

Example/preset using a CompoundActionSpace #148

Open
jamescasbon opened this issue Dec 10, 2018 · 6 comments
Labels
priority/p3 enhancements not currently in focus or low impact bugs
Comments

@jamescasbon
Contributor

Hi,

I see the StarCraft example can use the CompoundActionSpace, but are there any examples of how the agent should be configured? It's unclear to me which presets should support this action space.

thanks!

@zach-nervana
Contributor

The parts of the network architecture which deal specifically with the action spaces are called heads. It appears that the only head which supports CompoundActionSpace is PolicyHead. This head is used by the actor critic agent by default so I would expect that it should work. @galleibo-intel may know of other heads which are already compatible.
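To make the head idea above concrete, here is a minimal sketch (plain Python, not Coach's actual API) of how a policy head over a compound action space can emit one categorical distribution per sub-space and sample one sub-action from each:

```python
# Illustrative sketch only: names and shapes are assumptions, not Coach's
# PolicyHead implementation. The idea: for a compound action space, the
# head outputs one softmax distribution per sub-action space, and a full
# action is a tuple with one component drawn from each distribution.
import math
import random

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def compound_policy_head(per_subspace_logits):
    """One softmax per sub-action space."""
    return [softmax(logits) for logits in per_subspace_logits]

def sample_compound_action(distributions, rng=random):
    """Sample each sub-action independently from its distribution."""
    action = []
    for probs in distributions:
        r = rng.random()
        cum = 0.0
        for idx, p in enumerate(probs):
            cum += p
            if r <= cum:
                action.append(idx)
                break
        else:  # guard against floating-point round-off
            action.append(len(probs) - 1)
    return tuple(action)

# Example: a 3-way sub-space (e.g. an action type) and a 5-way sub-space
# (e.g. a target) produce a 2-component compound action.
dists = compound_policy_head([[0.1, 2.0, 0.5], [0.0, 0.0, 1.0, 0.0, 0.0]])
action = sample_compound_action(dists)
```

The per-sub-space factorization is the key point: the network needs one output per sub-space, not one output over the (possibly huge) cross product of all sub-spaces.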

Is there a particular configuration you are interested in?

@gal-leibovich
Contributor

Yep. At the moment, only PolicyHead supports CompoundActionSpace, and it is not in use in any of our existing presets. It is currently merely infrastructure for allowing future extensibility. In StarCraft, the goal is to allow the use of the full action space (compared to what is used in StarCraft's presets: the X,Y coordinates to move the troops to).
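The structure being described can be sketched as follows. This is a toy illustration of the idea, with class names that merely echo Coach's; it is not the library's implementation, and the sub-space choices (a 10-way discrete function id plus an 84x84 coordinate box) are invented StarCraft-flavored examples:

```python
# Toy sketch of a compound action space: a compound action carries one
# sub-action per member sub-space. Class names echo Coach's but this is
# NOT rl_coach's actual code or constructor signatures.
import random

class DiscreteActionSpace:
    def __init__(self, num_actions):
        self.num_actions = num_actions
    def sample(self, rng):
        return rng.randrange(self.num_actions)

class BoxActionSpace:
    """Continuous sub-space, e.g. X,Y screen coordinates."""
    def __init__(self, low, high):
        self.low, self.high = low, high
    def sample(self, rng):
        return [lo + rng.random() * (hi - lo)
                for lo, hi in zip(self.low, self.high)]

class CompoundActionSpace:
    """A compound action is a tuple: one sub-action per sub-space."""
    def __init__(self, sub_spaces):
        self.sub_spaces = sub_spaces
    def sample(self, rng):
        return tuple(s.sample(rng) for s in self.sub_spaces)

rng = random.Random(0)
# Full-action-space flavor: choose a function id, then coordinates for it.
space = CompoundActionSpace([
    DiscreteActionSpace(10),
    BoxActionSpace([0.0, 0.0], [84.0, 84.0]),
])
act = space.sample(rng)
```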

@jamescasbon
Contributor Author

Thank you, that's enough for me to get on with this. I hadn't quite got that this was the head part of the net, but it's obvious in hindsight.

@jamescasbon
Contributor Author

Spoke too soon....

I tried:

```python
agent_params = ActorCriticAgentParameters()
agent_params.exploration = {CompoundActionSpace: CategoricalParameters()}
```

  1. Which exploration policies should support compound action spaces?
  2. The ActorCriticAgent inherits from PolicyOptimizationAgent and therefore throws from here https://github.com/NervanaSystems/coach/blob/master/rl_coach/agents/policy_optimization_agent.py#L161 in choose_action.

Would adding random_action support for this space to the actor critic agent be sufficient to get this to work?
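The random_action idea suggested above can be sketched like this. A minimal illustration under assumed names (sub_space_sizes, epsilon, and the function names are all hypothetical, not Coach's API): a uniform random compound action samples each discrete sub-space independently, and an epsilon-greedy wrapper falls back to it during exploration:

```python
# Hedged sketch of the suggestion in this comment, not rl_coach code.
# All names here (sub_space_sizes, epsilon, choose_action) are invented
# for illustration.
import random

def random_compound_action(sub_space_sizes, rng=random):
    """Uniform random compound action: one independent draw per
    discrete sub-space."""
    return tuple(rng.randrange(n) for n in sub_space_sizes)

def choose_action(greedy_action, sub_space_sizes, epsilon, rng=random):
    """Epsilon-greedy over a compound space: with probability epsilon,
    sample every sub-action uniformly; otherwise return the policy's
    own action."""
    if rng.random() < epsilon:
        return random_compound_action(sub_space_sizes, rng)
    return greedy_action

# With epsilon=0 the policy's action is always returned unchanged.
```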

@jamescasbon jamescasbon reopened this Dec 11, 2018
@ryanpeach
Contributor

Also looking forward to this working.

@scttl scttl added this to To do in Coach Dev Jan 11, 2019
@galnov galnov added the priority/p3 enhancements not currently in focus or low impact bugs label Jan 16, 2019
@scttl scttl moved this from Requires Grooming to Groomed but Not Started in Coach Dev Jan 17, 2019
@rmitsch

rmitsch commented Apr 29, 2020

What's the current status on this? Which agent(s)/agent configurations support CompoundActionSpace?

6 participants