
terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast #3540

Open
nuomizai opened this issue Nov 4, 2020 · 23 comments

@nuomizai

nuomizai commented Nov 4, 2020

This error happened on the CARLA server while I was using the leaderboard and scenario_runner to create my A3C training environment. Strangely, it only appeared a few hours after training started. Does anyone know how to solve it?

@yasser-h-khalil

Same issue, occurring exactly as described!
I am using 0.9.10.
Have you found a solution to this yet?

@nuomizai

nuomizai commented Nov 9, 2020

> Same issue, occurring exactly as described!
> I am using 0.9.10.
> Have you found a solution to this yet?

Sorry, @yasser-h-khalil, I haven't found the reason or a solution yet. I used the leaderboard and scenario_runner. What is your setup?

@yasser-h-khalil

This is the command I use to launch the server: DISPLAY= ./CarlaUE4.sh -opengl -carla-port=2000.
I am using an RTX 5000 with the 410.48 driver.
It works for hours and then crashes with the following error:

terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_castterminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast
terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast

Signal 6 caught.
Signal 6 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554 
Signal 6 caught.
CommonUnixCrashHandler: Signal=6
Malloc Size=65535 LargeMemoryPoolOffset=131119 
Malloc Size=123824 LargeMemoryPoolOffset=254960 
Engine crash handling finished; re-raising signal 6 for the default handler. Good bye.
Aborted (core dumped)
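
A minimal watchdog sketch for long unattended runs (an assumption, not a fix reported in this thread): relaunch the server with the same command whenever the process dies.

# Watchdog sketch (assumption, not an official fix): restart CarlaUE4.sh
# whenever it exits, so training can resume after a crash like the one above.
import os
import subprocess
import time

CMD = ["./CarlaUE4.sh", "-opengl", "-carla-port=2000"]
ENV = dict(os.environ, DISPLAY="")   # mirrors the DISPLAY= prefix in the command above

while True:
    server = subprocess.Popen(CMD, env=ENV)
    code = server.wait()             # blocks until the server process exits or crashes
    print(f"CARLA server exited with code {code}; restarting in 10 s")
    time.sleep(10)                   # give the OS time to release port 2000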
@nuomizai

nuomizai commented Nov 9, 2020

> This is the command I use to launch the server: DISPLAY= ./CarlaUE4.sh -opengl -carla-port=2000.
> I am using an RTX 5000 with the 410.48 driver.
> It works for hours and then crashes with the following error:
>
> terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_castterminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast
> terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast
>
> Signal 6 caught.
> Signal 6 caught.
> Malloc Size=65538 LargeMemoryPoolOffset=65554
> Signal 6 caught.
> CommonUnixCrashHandler: Signal=6
> Malloc Size=65535 LargeMemoryPoolOffset=131119
> Malloc Size=123824 LargeMemoryPoolOffset=254960
> Engine crash handling finished; re-raising signal 6 for the default handler. Good bye.
> Aborted (core dumped)

The error is exactly the same as the one I met! I used an old version of the leaderboard and scenario_runner to train my DRL agent in a distributed manner, with CARLA 0.9.9.3 by the way. Now I am using the latest versions of the leaderboard and scenario_runner repos together with CARLA 0.9.10. I will let you know whether that works as soon as the training process finishes. I hope this helps if you have the same setup as me!

@yasser-h-khalil

Hello @nuomizai, are you using Traffic Manager?

@nuomizai

> Hello @nuomizai, are you using Traffic Manager?

Hey @yasser-h-khalil, sorry for the delay. Yes, I'm using the Traffic Manager. Actually, after I switched to the latest versions of the leaderboard and scenario_runner, this error was gone. Have you figured out the reason for it?
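
For context, "using Traffic Manager" here means the TM instance created through the Python client. A minimal sketch of a typical 0.9.10-style setup (the port numbers and synchronous mode below are assumptions for illustration, not a configuration reported in this thread):

# Illustrative Traffic Manager setup on CARLA 0.9.10 (ports and sync mode are
# assumptions for this sketch, not taken from this thread).
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)

tm = client.get_trafficmanager(8000)    # dedicated TM port, separate from the RPC port
tm.set_synchronous_mode(True)           # keep the TM in lockstep with a synchronous world

world = client.get_world()
for vehicle in world.get_actors().filter("vehicle.*"):
    vehicle.set_autopilot(True, tm.get_port())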

@yasser-h-khalil

No, I am still facing this issue.

@corkyw10

@glopezdiest could you follow up on this please?

@raozhongyu

I have met the same issue. Have you solved it?

@glopezdiest

Hey, this issue is probably related to this other one, which is a memory leak in the leaderboard (LB). We do know that it exists, but we haven't found the cause yet.
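
Since a memory leak is the suspected cause, one quick check is to log the server's resident memory over time. A rough diagnostic sketch using psutil (this is an assumption about how to observe the leak, not part of the leaderboard code):

# Rough diagnostic sketch (assumption): periodically log the CARLA server's
# resident memory with psutil to see whether it grows before the crash.
import time
import psutil

def carla_rss_mb():
    total = 0
    for proc in psutil.process_iter(["name"]):
        name = proc.info["name"] or ""
        if "CarlaUE4" in name:
            total += proc.memory_info().rss
    return total / (1024 * 1024)

while True:
    print(f"{time.strftime('%H:%M:%S')}  CarlaUE4 RSS: {carla_rss_mb():.0f} MB")
    time.sleep(60)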

@stale

stale bot commented Jul 21, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale bot added the "stale" label (Issue has not had recent activity) on Jul 21, 2021
@deepcs233

Met the same issue, +1.

@stale bot removed the "stale" label (Issue has not had recent activity) on Aug 21, 2021
@qhaas

qhaas commented Oct 20, 2021

Observed this with the CARLA 0.9.12 container on Ubuntu 18.04 with a consumer Kepler GPU; it seems random.

@grablerm

grablerm commented Jan 7, 2022

I met the same issue. Is there any solution?

@Kin-Zhang

Me too!!!!!

@stale

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale bot added the "stale" label (Issue has not had recent activity) on Apr 16, 2022
@jhih-ching-yeh

Met the same issue, +1.
Is there any solution?

@stale bot removed the "stale" label (Issue has not had recent activity) on Jun 20, 2022
@hlchen1043

Met the same issue. CARLA 0.9.10, RTX 3090, Ubuntu 20.04.

@buesma

buesma commented Jan 24, 2023

Same here. CARLA 0.9.13, RTX 3080, Ubuntu 20.04.

@AtongWang

I also ran into this while looping through scenarios in my code, which I believe points to a serious bug in CARLA.
CARLA 0.9.10, RTX 8000, Ubuntu 18.04, Python 3.7

@Unkn0wnH4ck3r

Same issue here after 1000+ rounds of RL training, which I believe is a Traffic Manager error. Any suggestions?
Signal 11 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554
CommonUnixCrashHandler: Signal=11
Malloc Size=131160 LargeMemoryPoolOffset=196744
Malloc Size=131160 LargeMemoryPoolOffset=327928
Engine crash handling finished; re-raising signal 11 for the default handler. Good bye.
Segmentation fault (core dumped)
Any clue why CARLA crashed?

Device Info:
GPU: NVIDIA Titan RTX 24G
RAM: 64G
CPU: i9 9900X

Ubuntu: 20.04.5
CUDA: 11.7
NVIDIA Driver Version: 525.89.02
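
One thing worth double-checking in a loop that long (a hedged suggestion, not a confirmed fix for this crash) is whether every episode fully destroys its actors and resets the synchronous/Traffic Manager state before the next one starts. A sketch of what such a cleanup step might look like (the function name and TM port are made up for illustration):

# Hedged per-episode cleanup sketch for long RL runs (the helper name, the TM
# port, and the idea that missing cleanup is involved are assumptions).
import carla

def cleanup_episode(client, tm_port=8000):
    world = client.get_world()

    # Destroy spawned vehicles, walkers and sensors in one batch.
    actors = [a for a in world.get_actors()
              if a.type_id.startswith(("vehicle.", "walker.", "sensor."))]
    client.apply_batch([carla.command.DestroyActor(a) for a in actors])

    # Leave the world and the Traffic Manager in asynchronous mode.
    settings = world.get_settings()
    settings.synchronous_mode = False
    settings.fixed_delta_seconds = None
    world.apply_settings(settings)
    client.get_trafficmanager(tm_port).set_synchronous_mode(False)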

@CurryChen77

I ran into the same situation when training my own RL agent for over 150 epochs. I also used some memory profiling tools, such as the memory-profiler and psutil Python modules, but the memory usage is not growing, so it shouldn't be a memory leak. Are there any better solutions?
Tested on two machines:

  • machine 1:
    Ubuntu20.04
    GPU: NVIDIA RTX Quadro 6000
    Nvidia driver: 535.129.03
  • machine 2:
    Ubuntu20.04
    GPU: NVIDIA RTX 4070
    Nvidia driver: 535.104.05
[Screenshot attached: Snipaste_2023-12-04_09-31-39]
@CMakey

CMakey commented Dec 28, 2023

Same issue here; the difference is that I'm just running the example file. When I run manual_control.py, UE4 just crashes and this error appears.
