question about traffic light #35
Hi~ author, in my opinion the TCP model directly uses raw images and some measurement signals as input, and does not consider intermediate perception results. But how does it learn traffic light information? If it relies only on expert trajectory samples for training, isn't the traffic light too small in the front view for the model to actually learn the "red-stop, green-start" behavior?

Besides, does the training dataset size have a crucial impact on the final performance of understanding traffic lights? Are there any relevant ablation experiments about this?
Yes, it learns the "red-stop, green-start" behavior from the expert demonstrations, and I think the current camera setup can capture the traffic light information. But you can also try adding another camera with an explicit traffic light detection module to enhance this ability, similar to LAV. Most of the training routes contain junctions with traffic lights, so traffic-light-related data is abundant. I think the dataset size is important for learning the traffic light rules, but we do not have such ablations.
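For readers who want to try the explicit-detection route mentioned above, here is a minimal PyTorch sketch of an auxiliary traffic-light-state head attached to shared image features, in the spirit of LAV's explicit module. The module name, feature dimension, and class count are all illustrative assumptions, not part of the TCP codebase.

```python
import torch.nn as nn

class TrafficLightHead(nn.Module):
    """Auxiliary head classifying the traffic light state
    (none / red / yellow / green) from shared image features.
    Hypothetical sketch: feature_dim and num_states are assumptions."""
    def __init__(self, feature_dim=512, num_states=4):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(feature_dim, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, num_states),
        )

    def forward(self, image_features):
        # image_features: (B, feature_dim), e.g. pooled backbone output
        return self.classifier(image_features)

# Trained with an extra cross-entropy term against light states logged
# alongside the expert demonstrations, e.g.:
# loss = planning_loss + aux_weight * F.cross_entropy(tl_logits, tl_labels)
```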
Thanks for your reply. Right now I only train on my own small dataset (about 75K samples), and I haven't fed images to the planner decoder directly; I think this is the main reason my model can't learn to understand traffic lights. :-) I'm going to try to design a front-view feature extraction network similar to TCP's. It seems the ego car can learn the "red-stop, green-start" behavior as long as I feed raw images to a simple network and train on a relatively big dataset, instead of using some complicated design, right? Many thanks for your answer~
So currently what is the input to your planner decoder if you do not feed the image features to it?
Actually, I input (1) embedding features for other cars and map detections, which are output by the front backbone and detection head, and (2) some ego-car state (including the command waypoint and speed). So I think I shouldn't use only the intermediate features; it seems the raw image is also needed.
Yes, you need to include inputs that carry traffic light information (like raw images or traffic light detection results).
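To make this concrete, here is a minimal sketch (not the actual TCP code; the module name and all dimensions are assumptions) of fusing detection embeddings, ego state, and raw front-view image features before the planner decoder, so that traffic light information can reach it:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet34

class PlannerInputFusion(nn.Module):
    """Fuse (1) detection embeddings, (2) ego-car state, and (3) raw
    front-view image features before the planner decoder.
    All dimensions here are illustrative."""
    def __init__(self, det_dim=256, ego_dim=8, img_dim=512, out_dim=512):
        super().__init__()
        backbone = resnet34(weights="IMAGENET1K_V1")
        # Drop the classification layer; keep the 512-d pooled feature.
        self.image_encoder = nn.Sequential(*list(backbone.children())[:-1])
        self.proj = nn.Linear(det_dim + ego_dim + img_dim, out_dim)

    def forward(self, det_emb, ego_state, front_image):
        # front_image: (B, 3, H, W) raw camera frame containing the light
        img_feat = self.image_encoder(front_image).flatten(1)  # (B, 512)
        fused = torch.cat([det_emb, ego_state, img_feat], dim=-1)
        return self.proj(fused)  # fed to the planner decoder
```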
Hi~ author, I read your code again and noticed you use a pretrained ResNet-34 to extract image features. Is a pretrained backbone necessary if I only want to get traffic light info from the front view? To limit the network size, perhaps a shallow custom-designed network is already enough? Not sure whether you've made such a comparison~
I think a shallow network would suffice if you have direct supervision on the traffic light states. |
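As a concrete version of the "shallow network with direct supervision" suggestion, here is a hedged sketch of a small from-scratch CNN; the layer sizes are guesses, not a measured comparison against the pretrained ResNet-34:

```python
import torch.nn as nn

class ShallowTLNet(nn.Module):
    """Small from-scratch CNN for traffic-light state only
    (a few tens of thousands of parameters vs. ~21M for ResNet-34).
    Untested sketch; layer sizes are assumptions."""
    def __init__(self, num_states=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, num_states)

    def forward(self, x):
        # x: (B, 3, H, W) front-view image (or a crop around the light)
        return self.head(self.features(x).flatten(1))

# Direct supervision: cross-entropy on traffic light state labels,
# trained jointly with or separately from the driving policy.
```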