Galactic + Protobuf typesupport: Runtime crash "File already exists in database" #12

Open
sebastian-freitag opened this issue Mar 15, 2022 · 5 comments

Comments

@sebastian-freitag

ROS2 Galactic
Ubuntu 20.04
rmw_ecal master
rosidl_typesupport_protobuf master
ecal v5.10.0-alpha-10-g221e5b7e (11.03.2022)

$ protoc --version
libprotoc 3.17.1

When I run simple examples, I see something like this and the process ends:

[INFO] [talker]: process started with pid [2977830]
[talker] [libprotobuf ERROR google/protobuf/descriptor_database.cc:641] File already exists in database: builtin_interfaces/msg/Duration.proto
[talker] [libprotobuf FATAL google/protobuf/descriptor.cc:1371] CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size): 
[talker] terminate called after throwing an instance of 'google::protobuf::FatalException'
[talker]   what():  CHECK failed: GeneratedDatabase()->Add(encoded_file_descriptor, size):
@sebastian-freitag
Author

I also tried with

$ protoc --version
libprotoc 3.18.2

Same issue.

@FlorianReimold
Member

FlorianReimold commented Mar 16, 2022

Hi Sebastian,

Good news: it's not your fault.
Bad news: I wish I had a solution, but I don't. You can use the dynamic typesupport, but the protobuf typesupport is currently broken because of this issue.

I debugged that issue in the past, so let me share what I found back then. Unfortunately, I wasn't familiar enough with RMW implementations to fix it, but I can say what has to change. If you have experience with RMW implementations, we would be grateful for any help.

The crucial thing is the following message from your console:

File already exists in database: builtin_interfaces/msg/Duration.proto

Protobuf stores all descriptor information in one global database per process. So when your process loads multiple shared objects, each of them adds its protobuf descriptors to that database, and protobuf does not allow duplicates.
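To make the failure mode concrete, here is a minimal sketch of a process-wide registry that rejects a second entry with the same file name. It is only a toy model: `ToyDescriptorDatabase`, `ToyGeneratedDatabase` and `RegisterEmbeddedProto` are hypothetical names and not part of protobuf's API. The real check lives in `descriptor_database.cc`, as the error message shows.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for protobuf's process-wide generated descriptor database.
// Not protobuf's real implementation; it only models "one entry per file name".
class ToyDescriptorDatabase {
 public:
  // Returns true if the file was added, false if the name already exists.
  bool Add(const std::string& file_name) {
    return files_.insert(file_name).second;
  }

 private:
  std::set<std::string> files_;
};

// One instance per process, shared by every library loaded into it.
ToyDescriptorDatabase& ToyGeneratedDatabase() {
  static ToyDescriptorDatabase database;
  return database;
}

// Models the static initializer of a library that embeds a compiled .proto.
bool RegisterEmbeddedProto(const std::string& file_name) {
  return ToyGeneratedDatabase().Add(file_name);
}

// Simulates loading the _c and _cpp typesupport libraries in one process.
// Returns true if the first registration succeeded and the second was rejected.
bool SecondLibraryIsRejected() {
  const std::string proto = "builtin_interfaces/msg/Duration.proto";
  const bool first_ok = RegisterEmbeddedProto(proto);   // ..._protobuf_c.so
  const bool second_ok = RegisterEmbeddedProto(proto);  // ..._protobuf_cpp.so
  return first_ok && !second_ok;
}
```

In this model the second registration simply returns false. In protobuf the failed `Add` is wrapped in a CHECK, which is why the process terminates instead.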

If you let gdb output the loaded libraries at the time of the crash, you will get something similar to the following:

(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x00007ffff7fd0100 0x00007ffff7ff2674 Yes (*) /lib64/ld-linux-x86-64.so.2
0x00007ffff7e80970 0x00007ffff7f70a73 Yes (*) /opt/ros/galactic/lib/librclcpp.so
0x00007ffff7d9e400 0x00007ffff7d9ec39 Yes (*) /opt/ros/galactic/lib/libstd_msgs__rosidl_typesupport_cpp.so
0x00007ffff7d62160 0x00007ffff7d82195 Yes (*) /opt/ros/galactic/lib/librcl.so
0x00007ffff7d4e500 0x00007ffff7d511b0 Yes (*) /opt/ros/galactic/lib/librmw.so
0x00007ffff7d37f20 0x00007ffff7d42232 Yes (*) /opt/ros/galactic/lib/librcutils.so
0x00007ffff7d2e040 0x00007ffff7d2e25c Yes (*) /opt/ros/galactic/lib/libtracetools.so
0x00007ffff7baf160 0x00007ffff7c97452 Yes (*) /lib/x86_64-linux-gnu/libstdc++.so.6
0x00007ffff7af95e0 0x00007ffff7b0a045 Yes (*) /lib/x86_64-linux-gnu/libgcc_s.so.1
0x00007ffff7927630 0x00007ffff7a9c20d Yes /lib/x86_64-linux-gnu/libc.so.6
0x00007ffff78e6ae0 0x00007ffff78f64d5 Yes /lib/x86_64-linux-gnu/libpthread.so.0
0x00007ffff78d6810 0x00007ffff78da2bb Yes (*) /opt/ros/galactic/lib/libament_index_cpp.so
0x00007ffff78ce3c0 0x00007ffff78cffb0 Yes (*) /opt/ros/galactic/lib/liblibstatistics_collector.so
0x00007ffff78c1800 0x00007ffff78c6881 Yes (*) /opt/ros/galactic/lib/librcl_yaml_param_parser.so
0x00007ffff78bb060 0x00007ffff78bb183 Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/rosgraph_msgs/lib/librosgraph_msgs__rosidl_typesupport_cpp.so
0x00007ffff78b40a0 0x00007ffff78b4297 Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/statistics_msgs/lib/libstatistics_msgs__rosidl_typesupport_cpp.so
0x00007ffff78ab320 0x00007ffff78abcd5 Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/rcl_interfaces/lib/librcl_interfaces__rosidl_typesupport_cpp.so
0x00007ffff789d3c0 0x00007ffff78a098b Yes (*) /opt/ros/galactic/lib/librmw_implementation.so
0x00007ffff7896120 0x00007ffff7896544 Yes (*) /opt/ros/galactic/lib/librcl_logging_interface.so
0x00007ffff7889d00 0x00007ffff788f047 Yes (*) /opt/ros/galactic/lib/librcpputils.so
0x00007ffff77443c0 0x00007ffff77eaf18 Yes /lib/x86_64-linux-gnu/libm.so.6
0x00007ffff772e2a0 0x00007ffff772f21e Yes (*) /opt/ros/galactic/lib/librosidl_typesupport_cpp.so
0x00007ffff7725040 0x00007ffff772589c Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/rcl_interfaces/lib/librcl_interfaces__rosidl_typesupport_c.so
0x00007ffff771b4a0 0x00007ffff771d304 Yes (*) /opt/ros/galactic/lib/librcl_logging_spdlog.so
0x00007ffff770ae60 0x00007ffff7710ef5 Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/rcl_interfaces/lib/librcl_interfaces__rosidl_generator_c.so
0x00007ffff76fd420 0x00007ffff76ff21b Yes (*) /opt/ros/galactic/lib/librosidl_runtime_c.so
0x00007ffff76f4220 0x00007ffff76f5179 Yes /lib/x86_64-linux-gnu/libdl.so.2
0x00007ffff76d2480 0x00007ffff76eb6ff Yes (*) /opt/ros/galactic/lib/libyaml.so
0x00007ffff76cb2a0 0x00007ffff76cc358 Yes (*) /opt/ros/galactic/lib/librosidl_typesupport_c.so
0x00007ffff7653c80 0x00007ffff76b181a Yes (*) /lib/x86_64-linux-gnu/libspdlog.so.1
0x00007ffff761d1e0 0x00007ffff761d9d7 Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/builtin_interfaces/lib/libbuiltin_interfaces__rosidl_generator_c.so
0x00007ffff7d23da0 0x00007ffff7d268e9 Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/rmw_ecal_proto_cpp/lib/librmw_ecal_proto_cpp.so
0x00007ffff7572850 0x00007ffff75d69a8 Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/rmw_ecal_shared_cpp/lib/librmw_ecal_shared_cpp.so
0x00007ffff7d19140 0x00007ffff7d1932d Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/rosidl_typesupport_protobuf_c/lib/librosidl_typesupport_protobuf_c.so
0x00007ffff7d136e0 0x00007ffff7d143b0 Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/rosidl_typesupport_protobuf_cpp/lib/librosidl_typesupport_protobuf_cpp.so
0x00007ffff735ae60 0x00007ffff743b2d5 Yes (*) /lib/x86_64-linux-gnu/libecal_core.so.5
0x00007ffff70a3e40 0x00007ffff724afdb Yes (*) /lib/x86_64-linux-gnu/libprotobuf.so.17
0x00007ffff7d09720 0x00007ffff7d0cd70 Yes /lib/x86_64-linux-gnu/librt.so.1
0x00007ffff6fca280 0x00007ffff6fdae2b Yes (*) /lib/x86_64-linux-gnu/libz.so.1
0x00007ffff7d020e0 0x00007ffff7d022e4 Yes (*) /lib/x86_64-linux-gnu/libecaltime-localtime.so
0x00007ffff475fcf0 0x00007ffff479c0ee Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/rcl_interfaces/lib/librcl_interfaces__rosidl_typesupport_protobuf_c.so
0x00007ffff74ae560 0x00007ffff74b2952 Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/builtin_interfaces/lib/libbuiltin_interfaces__rosidl_typesupport_protobuf_c.so
0x00007ffff46870d0 0x00007ffff46d043e Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/rcl_interfaces/lib/librcl_interfaces__rosidl_typesupport_protobuf_cpp.so
0x00007ffff7499560 0x00007ffff749d952 Yes (*) /home/florian/Projects/rmw_ecal/colcon_ws/install/builtin_interfaces/lib/libbuiltin_interfaces__rosidl_typesupport_protobuf_cpp.so

As you can see at the end (i.e. right before the crash), it loaded two libraries:

  1. libbuiltin_interfaces__rosidl_typesupport_protobuf_c.so
  2. libbuiltin_interfaces__rosidl_typesupport_protobuf_cpp.so

I then opened those two .so files with Notepad++ (yes, that's sketchy, but sufficient for searching for strings 😉):

Duration

Both .so files that get loaded bring their own copy of Duration.proto compiled into them, and that's the issue. For some reason, this problem first appeared with Galactic. Older ROS versions apparently behaved differently and didn't load both the C and CPP libraries.

How can it be solved

Essentially, we have to remove the protobuf objects from one of the two libraries. As both depend on them, it may be a good idea to move them entirely into their own shared object file, which both the C and CPP typesupport then use. With this proposal we would have:

  • xxx_typesupport_protobuf.so << links the protobuf messages

which is then used by:

  • xxx_typesupport_protobuf_c.so
  • xxx_typesupport_protobuf_cpp.so

As the "main" typesupport_protobuf.so is a shared object file, it will only be loaded once per process, even if both the C and CPP typesupport attempt to load it.
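The effect of the proposed split can be sketched in plain C++. This is a toy model under stated assumptions, not the actual build change: `ToyRegistry`, `CommonProtoLibraryLoaded`, `InitCTypesupport` and `InitCppTypesupport` are hypothetical names. A function-local static stands in for the static initializers of the common shared object, which run once per process no matter how many libraries depend on it.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Toy process-wide registry (hypothetical, not protobuf's API).
std::set<std::string>& ToyRegistry() {
  static std::set<std::string> files;
  return files;
}

// Stands in for the proposed common library that alone embeds the messages.
// The function-local static is initialized exactly once per process, much
// like a shared object's static initializers run once when it is loaded.
bool CommonProtoLibraryLoaded() {
  static const bool registered =
      ToyRegistry().insert("builtin_interfaces/msg/Duration.proto").second;
  return registered;
}

// The C and C++ typesupports no longer embed the descriptor themselves;
// they only depend on the common library.
bool InitCTypesupport() { return CommonProtoLibraryLoaded(); }
bool InitCppTypesupport() { return CommonProtoLibraryLoaded(); }

// Number of files currently held by the toy registry.
std::size_t RegisteredFileCount() { return ToyRegistry().size(); }
```

Both init functions succeed and the registry ends up with a single entry, because only one place ever attempts the registration.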

@FlorianReimold FlorianReimold changed the title Runtime error. Galactic + Protobuf typesupport: Runtime crash "File already exists in database" Mar 16, 2022
@sebastian-freitag
Author

Thank you for the very detailed explanation. Can you keep this issue open, so that others see the discussion, please? Closed issues are harder to find with Google ;). And maybe we find a proper solution (or someone manages a PR for ROS2 to fix it...).

I already suspected something in that direction, but I went a different way to work around it: I compiled protobuf from source and patched it to disable the checks. I don't want that hack to become permanent, but I thought I'd share it anyway:

diff --git a/src/google/protobuf/descriptor_database.cc b/src/google/protobuf/descriptor_database.cc
index b101dd222..e9d4b138c 100644
--- a/src/google/protobuf/descriptor_database.cc
+++ b/src/google/protobuf/descriptor_database.cc
@@ -638,23 +638,30 @@ bool EncodedDescriptorDatabase::DescriptorIndex::AddFile(const FileProto& file,
                                EncodeString(file.name())}) ||
       std::binary_search(by_name_flat_.begin(), by_name_flat_.end(),
                          file.name(), by_name_.key_comp())) {
-    GOOGLE_LOG(ERROR) << "File already exists in database: " << file.name();
-    return false;
+    // GOOGLE_LOG(ERROR) << "File already exists in database: " << file.name();
+    // We just assume "everything is fine" when the file is already in database.
+    // That is NOT SAFE but a workaround for an ecal/ros2 issue.
+    // continental/rosidl_typesupport_protobuf#12
+    return true;
   }
 
diff --git a/src/google/protobuf/message.cc b/src/google/protobuf/message.cc
index 6f4069484..7e3b49292 100644
--- a/src/google/protobuf/message.cc
+++ b/src/google/protobuf/message.cc
@@ -250,9 +250,10 @@ GeneratedMessageFactory* GeneratedMessageFactory::singleton() {
 
 void GeneratedMessageFactory::RegisterFile(
     const google::protobuf::internal::DescriptorTable* table) {
-  if (!InsertIfNotPresent(&file_map_, table->filename, table)) {
-    GOOGLE_LOG(FATAL) << "File is already registered: " << table->filename;
-  }
+  InsertIfNotPresent(&file_map_, table->filename, table);
+  // if (!InsertIfNotPresent(&file_map_, table->filename, table)) {
+  //   GOOGLE_LOG(FATAL) << "File is already registered: " << table->filename;
+  // }
+  // We just assume "everything is fine" when the file is already in database.
+  // That is NOT SAFE but a workaround for an ecal/ros2 issue.
+  // continental/rosidl_typesupport_protobuf#12
 }
 
 void GeneratedMessageFactory::RegisterType(const Descriptor* descriptor,
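
The patch above accepts every duplicate, which is why its comments call it NOT SAFE. For illustration only, here is a sketch of a stricter policy: accept a duplicate when the encoded descriptor bytes are identical, and report a conflict otherwise. `TolerantRegistry` and `AddResult` are hypothetical names. This is not how protobuf behaves and not part of the patch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Outcome of a registration attempt in the toy registry below.
enum class AddResult { kAdded, kIdenticalDuplicate, kConflict };

// Hypothetical registry keyed by file name that also remembers the encoded
// descriptor bytes, so it can tell harmless duplicates from real conflicts.
class TolerantRegistry {
 public:
  AddResult Add(const std::string& file_name, const std::string& encoded) {
    const auto result = files_.emplace(file_name, encoded);
    if (result.second) {
      return AddResult::kAdded;  // First registration of this file name.
    }
    // Name already present: compare against the bytes stored earlier.
    return result.first->second == encoded ? AddResult::kIdenticalDuplicate
                                           : AddResult::kConflict;
  }

 private:
  std::map<std::string, std::string> files_;
};
```

With such a policy, two libraries that embed the same Duration.proto would be tolerated, while two different files that happen to share a name would still be detected.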

@ZhenshengLee
Contributor

@brakmic-aleksandar brakmic-aleksandar transferred this issue from eclipse-ecal/rmw_ecal May 19, 2022
@brakmic-aleksandar
Contributor

Moved the issue to the typesupport repo, since it's a typesupport-related issue.
