-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bugfix/unregistration memfile event handling #1214
bugfix/unregistration memfile event handling #1214
Conversation
…when creating/destroying subscriber dynamically for the same topic
…nts are closed on CSyncMemFile Destruction/Disconnect) added empty AddLocConnection/AddExtConnection RemLocConnection/RemExtConnection for all layers as unimplemented base functions (ecal_writer_base.h)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, a lot of questions (concerning) the original design, still remain.
I find it a bit questionable, to keep all signalling resources open until the memory file itself is destroyed.
Couldn't it lead to the same problems on subscriber side, e.g. if the subscriber side destroys the events (e..g after 60 second timeout) and then creates a new subscriber?
(E.g. now we made sure events are not closed on sender side, because in the testcase they stayed open on subscriber side. What if they are closed/reopened on subscriber side... in that case communication can't be established, either?)
It feels to me, like we might have just shifted the problem.
Maybe it's worth to take a look at the underlying NamedEvents.
virtual bool AddExtConnection(const std::string& /*host_name_*/, const std::string& /*process_id_*/, const std::string& /*conn_par_*/) { return false; }; | ||
virtual bool RemExtConnection(const std::string& /*host_name_*/, const std::string& /*process_id_*/) { return false; }; | ||
virtual bool AddExtConnection(const std::string& /*host_name_*/, const std::string& /*process_id_*/, const std::string& /*topic_id_*/, const std::string& /*conn_par_*/) { return false; }; | ||
virtual bool RemExtConnection(const std::string& /*host_name_*/, const std::string& /*process_id_*/, const std::string& /*topic_id_*/) { return false; }; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you change the function signatures? Are the arguments actually being used somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not used yet, but IF any layer wants to intercept that information it makes sense to provide host_name, process_id, topic_id to identify exactly the connection the is added or removed. The old signature supported just SHM with the needed information process_id.
{ | ||
m_writer.shm.RemLocConnection(loc_sub.process_id); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the "goodbye" message but happens on registration timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's not the goodbye message and no more needed at all. It closed the connected events for the memory file that was connected to this shm writer. This is now done since many version in the destruction of the CSyncMemFile.
|
||
return ret_state; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a bit wrong that connections cannot be removed again. So the resources stay open until the publisher is closed? Is that what happens? Also they continue being signalled, etc?
And btw, I guess you have addd line 206 to prevent the testcase to fail. see the huge comment. Maybe it was just not working at all on Posix...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's no more signalling, it is just holding the resources for MemFile + Events. The publisher (writer) knows the number of connections local/extern all the time.
// currently all these functions are not implemented (there is no need to react on) | ||
m_writer.udp_mc.RemLocConnection (local_info_.process_id, local_info_.topic_id); | ||
m_writer.shm.RemLocConnection (local_info_.process_id, local_info_.topic_id); | ||
m_writer.tcp.RemLocConnection (local_info_.process_id, local_info_.topic_id); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding the tcp RemLocConnection
/ AddExtConnection
should preferably be done in a different PR as it is not related to the issue that is being fixed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will mark this PR as a cleanup and we do the fix in another one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now it doesn't work in the case that the subscriber is calling finalize()
and initialize()
mixed up bugfixes and code cleanups + did not really fix the issue |
Description
The new introduced "active register/unregister logic" calls depending on the connection type specific functions like (AddLocConnection/RemLocConnection,AddExtConnection/RemExtConnection) for the SHM and UDP data writer (TCP was just not handled for now).
The RemLocConnection was not called in that context before the new unregister/register logic and is closing the 2 connection events for updating and acknowledging memory file data transfer/receive. This leads to a dead shm connection when the subscriber on the other side is destroyed (triggering RemLocConnection in the publishing process) and created again.
Solution: Events are not closed on that level but are cleaned up together with the memory file resource (CSyncMemFile) properly.
Related issues
No issue now, will be created soon ..
Cherry-pick to