model training process setup #121

zzt007 · 2024-03-06T06:42:04Z

Hi, thanks for your great work! I have completed all the steps mentioned in the README and have started training the model. However, the car moves very slowly in RViz and Gazebo and does not run as smoothly as in the example; it sometimes even waits a minute before changing position. Is this because of my computer's poor performance?

My computer and environment are as follows:

CPU: Intel i7-13700K
No NVIDIA graphics card
Windows WSL2 with Ubuntu 20.04 and ROS Noetic

The following two screenshots were taken 1 minute apart.
The terminal output looks like this 👇

reiniscimurs · 2024-03-06T07:42:57Z

Hi,

I have deployed this model on i5 and i3 CPUs without CUDA, so the part of training where episodes are collected is not that resource intensive. Taking a minute to execute a single step (or even a couple of steps) does not seem normal to me. However, it looks like you are using a virtual machine, which either might not have enough resources or might not be configured correctly. I would suggest checking whether any other ROS application works there; if it does not run smoothly either, you will know where the problem lies. In any case, this seems to be more of a hardware issue, and I don't think I can help you much there.
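One way to put a number on "does not run smoothly" is the real-time factor: simulated seconds elapsed per wall-clock second. The snippet below is only a minimal sketch, not part of this repository. `real_time_factor` is a hypothetical helper, and the (wall time, simulation time) pairs are assumed to be noted down by hand or by a small script of your own.

```python
def real_time_factor(samples):
    """Return simulated seconds per wall-clock second.

    `samples` is an ordered list of (wall_seconds, sim_seconds) pairs.
    Both the helper and the sample format are assumptions made for
    illustration; they are not part of this repository.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (wall_start, sim_start), (wall_end, sim_end) = samples[0], samples[-1]
    wall_elapsed = wall_end - wall_start
    if wall_elapsed <= 0:
        raise ValueError("wall-clock time must increase between samples")
    return (sim_end - sim_start) / wall_elapsed


# Example: 5 simulated seconds passed during 10 wall-clock seconds.
rtf = real_time_factor([(0.0, 0.0), (10.0, 5.0)])
print(f"real-time factor: {rtf:.2f}")
```

A value close to 1 means the simulation keeps up with real time. A value far below 1 for a simple scene would point at the machine or the virtualization layer rather than at the training code.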

zzt007 · 2024-03-06T08:26:20Z

Thank you for your reply. Does this mean the program is running successfully? I want to debug this program to learn how DRL connects to the ROS and Gazebo simulation, so that I can then work on my own project.

reiniscimurs · 2024-03-06T09:01:41Z

From what I can see, the software is launched properly.

zzt007 · 2024-03-07T08:25:01Z

Hi there, how can I judge whether the trained model has converged? This is my first experience with RL training, so which metrics should I check? The loss curve, as with a DL model? Or whether the reward reaches a stable value?
The following is my terminal output during training.
-- training start

-- epoch increase

I find that as the number of epochs increases, the average reward becomes negative. Looking forward to your reply.

reiniscimurs · 2024-03-07T21:40:27Z

Hi,

A better indicator would be the curves in TensorBoard. In your case, evaluation at the beginning probably places the goals really close to the robot, so it "collects" them by chance. As training goes on, the goal distance increases and the situations become more complex for the robot.

Loss is not a good indicator; you can read more on the topic here:
#89 (comment)

Generally, I would look for convergence of the max Q value in TensorBoard.
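If you want something more concrete than eyeballing the curve, a rough plateau check can be sketched as below. This assumes you have already exported the logged scalar values into a plain Python list; `moving_average`, `has_plateaued`, the window size, and the tolerance are all hypothetical choices for illustration, not anything defined in this repository.

```python
def moving_average(values, window):
    """Simple trailing moving average over `values`."""
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            smoothed.append(running / window)
    return smoothed


def has_plateaued(values, window=5, tolerance=0.05):
    """Heuristic: the last `window` smoothed points vary by at most
    `tolerance` relative to the latest smoothed value.

    The threshold is an arbitrary assumption; tune it for your own runs.
    """
    smoothed = moving_average(values, window)
    if len(smoothed) < window:
        return False  # not enough data to judge
    recent = smoothed[-window:]
    spread = max(recent) - min(recent)
    scale = max(abs(recent[-1]), 1e-8)  # avoid division by zero
    return spread / scale <= tolerance


# Example with made-up numbers: a curve that rises and then flattens.
curve = list(range(1, 11)) + [10] * 10
print(has_plateaued(curve))
```

This is only a heuristic. A flat curve tells you the value estimate has stopped changing, not that the resulting policy is good, so it is still worth watching the robot's behaviour during evaluation.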

zzt007 · 2024-03-08T06:34:50Z

Many thanks for your kind help and for sharing.
