This repository has been archived by the owner on Dec 11, 2022. It is now read-only.

Slow training of RL agent on Carla using "CARLA_3_Cameras_DDPG", the cumulative reward is always negative [-174, -7292], and after some episodes (4 hours) the car doesn't move. #218

Open
ucefguns123 opened this issue Feb 4, 2019 · 3 comments

@ucefguns123

I'm using a dual-processor Xeon machine (32 cores) with a GTX 1080 Ti, and training the RL agent on CARLA with the "CARLA_3_Cameras_DDPG" preset is really slow. The cumulative reward is always negative (between -7292 and -174), and after some episodes (about 4 hours) the car doesn't even move.

@ucefguns123 ucefguns123 changed the title very slow training of RL agent on Carla using "CARLA_3_Cameras_DDPG", the cumulative reward is always negative [-174, -7292], and after some episodes (4 hours) the car doesn't even move. Feb 5, 2019
@scttl scttl added this to Requires Grooming in Coach Dev via automation Feb 5, 2019
@scttl scttl added the priority/p2 questions needing answered or medium impact bugs label Feb 5, 2019
@scttl
Contributor

scttl commented Feb 5, 2019

Hi, in order to help you out, can you provide a bit more info on your setup?

  1. Please specify the exact command line and arguments you use to launch the training.
  2. Even though it remains negative, does the reward change per episode, or is it fixed at the exact same value? If you can copy/paste some output snippets, that would be helpful.
  3. Have you tried running any of the other CARLA presets? Do they converge?
@ucefguns123
Author

Hi, and thank you for your help.
Here is the requested information:
1- Command used: coach -r -s 10800 -tb -p CARLA_3_Cameras_DDPG
2- The reward changes per episode, but it is always negative.
3- Yes, for CARLA I've used DDPG and Dueling DQN; the latter works better.

Training Reward

  
-3375.4318577085737
-658.1820359512911
-7292.888087370201
-4194.976244396317
-2313.8986227780583
-2479.7722832604945
-2359.5674526625917
-1592.1974726633464
-2022.9361864142775
-2851.957881139969
-2752.0069561445052
-6531.36169757659
-3160.118598735955
-3206.913054948825
-4817.749369546878
-6764.666624778123
-4220.212154232195
-2814.6266234517548
-3515.7281597536626
 
-3217.5484072529507
-3983.715273880441
-2781.0031403546755
-2582.9778304927263
-1970.7004200074937
-2650.2153545704496
-2709.1529443686263
-4793.143115617591
-4135.73948334726
-3015.250698368811
-2457.6602572066604
-3419.2055087590425
-2577.5702982787034
-2799.191478872708
-3749.5771996646013
-2010.359758612145
-3188.1144677147454
-2520.438351801781
-2879.1281298021877
 
-2447.6234248306228
-3632.606660207143
-2101.073098433635
-4864.339430301097
-1232.0016605225137
-3365.4689117227194
-2707.84076497163
-2641.6557392035456
-3534.609646937069
-3100.1214107375395
-4967.677028469975
-2749.655002127858
-2333.0746405660225
-2419.4950128602263
-3080.5406780568464
-3747.595203179945
-2950.4251160717304
-2417.7432665587694
-4334.234278008437
 
-3104.6161427189004
-4158.8318178195905
-4187.715641482067
-2340.3694156449824
-2476.3408608294767
-2195.6881479710933
-3664.086466143076
-4906.060391674694
-3566.3108913950164
-2820.5014961828474
-3856.1330618598668
-2486.2877289702315
-4118.83379371246
-3981.740076476962
-2738.1312312321575
-4032.9327030921418
-2386.1191360443463
-2353.3606761868828
-3565.8829722816126
 
-2138.6224752614316
-1627.5817477722687
-2853.0524599046685
-3941.5082148490837
-4051.677611110275
-2028.438954986763
-3265.214124584373
-3824.619178381645
-2302.822193759454
-2901.0556972954782
-2613.0453525240055
-4073.0665395648666
-3812.2979067585766
-2098.780390902723
-174.5424602759394
-2872.970626091662
-2689.0528508512866
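
A quick way to check whether a reward column like the one above shows any trend is to compare a moving average at the start and end of the run. The following is only a minimal sketch: `moving_average`, `summarize_rewards`, and `sample_rewards` are hypothetical names for illustration and are not part of Coach, and the sample holds just the first five and last six values from the list above.

```python
from statistics import mean


def moving_average(values, window):
    """Return the simple moving average over each full window of `values`."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def summarize_rewards(rewards, window=5):
    """Summarize per-episode rewards: range, mean, and first vs. last window."""
    if len(rewards) < window:
        raise ValueError("need at least `window` rewards")
    smoothed = moving_average(rewards, window)
    return {
        "episodes": len(rewards),
        "min": min(rewards),
        "max": max(rewards),
        "mean": mean(rewards),
        # True only if every episode produced exactly the same reward.
        "all_identical": len(set(rewards)) == 1,
        "first_window_mean": smoothed[0],
        "last_window_mean": smoothed[-1],
    }


# First five and last six values copied from the list above.
# Paste the full column here for a real check.
sample_rewards = [
    -3375.4318577085737, -658.1820359512911, -7292.888087370201,
    -4194.976244396317, -2313.8986227780583,
    -4073.0665395648666, -3812.2979067585766, -2098.780390902723,
    -174.5424602759394, -2872.970626091662, -2689.0528508512866,
]

if __name__ == "__main__":
    for key, value in summarize_rewards(sample_rewards).items():
        print(f"{key}: {value}")
```

If the last-window mean is not clearly higher than the first-window mean over the full column, the rewards are fluctuating rather than improving.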

@galnov
Member

galnov commented Feb 27, 2019

This is reproducible, and does not converge even after a few days of training.

@galnov galnov moved this from Requires Grooming to Groomed but Not Started in Coach Dev Feb 27, 2019
3 participants