Thread-safety of disconnecting slots #23
Take the following example code (a reconstructed sketch is given at the end of this post):

I thought this should be thread-safe, since Observer is built in such a way that the connection goes out of scope before the shared state (the observer instance and its state member) of the slot gets destroyed.

It is not, though, since (1) creates a copy of the list of slots at the beginning of the signal emission. The disconnect triggered by (2) then happens asynchronously, and the callback can potentially be executed after the observer got destroyed.

The documentation of signal::operator() states that thread-safety only applies to the signal itself, but it seems like this makes it very hard to manage the lifetime of the observer. The lifetime of the shared state cannot be bound to the lifetime of the scoped_connection, but must be coupled to the lifetime of the lambda.

One option would be to change signal::operator() so that it keeps the mutex locked while the slots are running. I don't care about the additional locking in my application, but this would make managing the lifetimes of the callbacks a lot easier.

There is also the sigslot::observer class provided by this library. The example for that class uses global state, so it doesn't have this problem. But I think this should be documented, since using the observer class like in my example seems very natural.
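The original snippet is missing from this extract; what follows is a minimal reconstruction of the pattern described above, with assumed names and timings (Observer, state, conn, the loop counts) — (1) marks the signal emission and (2) marks the observer destruction:

#include <chrono>
#include <memory>
#include <thread>

#include <sigslot/signal.hpp>

struct Observer {
    explicit Observer(sigslot::signal<> &sig)
        : conn(sig.connect([this] { state++; })) {}

    int state = 0;

private:
    // declared last, so it is destroyed (and disconnected) before state
    sigslot::scoped_connection conn;
};

int main() {
    sigslot::signal<> sig;
    auto observer = std::make_unique<Observer>(sig);

    std::thread t([&sig] {
        for (int i = 0; i < 100000; ++i)
            sig();                 // (1) copies the slot list, then invokes it
    });

    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    observer.reset();              // (2) disconnects, then destroys the state
    t.join();
}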
Thank you for the detailed analysis. I agree with your assessment that using scoped_connections in a multi-threaded context requires careful planning, and that the current documentation does not offer enough guidance and/or warnings.

I actually invoked the slots under lock in the past, but it caused thread-safety concerns; more precisely, it made deadlocks possible. One way of mitigating the problem may be to copy a list of weak references to the actual slot objects, but that does not fix it entirely: in your use case, Thread 1 may start invocation of the lambda while reset() is being called from Thread 2.

The best way to protect against such behavior is to rely on the slot tracking feature offered by the library. Here is a worked example that shows how to do it:

#include <atomic>
#include <chrono>
#include <iostream>
#include <memory>
#include <thread>

#include <sigslot/signal.hpp>

void thread1(sigslot::signal<> &sig) {
    const auto deadline = std::chrono::steady_clock::now()
                          + std::chrono::milliseconds(200);
    while (std::chrono::steady_clock::now() < deadline) {
        for (int i = 0; i < 1000; ++i) {
            sig();
        }
    }
}

struct Observer {
    static auto create(sigslot::signal<> &signal) {
        auto obs = std::make_shared<Observer>();
        // connecting through a shared_ptr enables slot tracking:
        // the signal only keeps a weak reference to the observer
        signal.connect(&Observer::slot, obs);
        return obs;
    }

    std::atomic<int> state{};

private:
    void slot() {
        state++;
    }
};

int main() {
    sigslot::signal<> sig;
    auto observer = Observer::create(sig);

    std::thread t(thread1, std::ref(sig));
    std::this_thread::sleep_for(std::chrono::milliseconds(100));

    int v = observer->state;
    observer.reset();
    std::cout << "Got " << v << std::endl;

    t.join();
}

The main selling point of this approach is that the lifetime of the Observer gets extended, via weak_ptr locking, while the slot is being invoked. This has the obvious advantage of avoiding heavy locking. Would that be an acceptable solution to you?
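To illustrate the weak_ptr locking mentioned above, here is a minimal sketch of the tracking idea (illustrative only, not sigslot's actual internals):

#include <functional>
#include <memory>

struct tracked_slot {
    std::weak_ptr<void> target;           // weak reference to the tracked observer
    std::function<void()> call;

    void operator()() const {
        if (auto alive = target.lock()) { // extends the observer's lifetime
            call();                       // safe: the observer cannot be destroyed here
        }                                 // lifetime extension ends
    }
};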
Thank you for your reply. Could you elaborate on the possible deadlocks when calling the slots under the lock? We are currently running sigslot with the following patch:

@@ -686,10 +686,8 @@ public:
         if (m_block)
             return;
 
+        lock_type lock(m_mutex);
+
         // copy slots to execute them out of the lock
         cow_copy_type<list_type, Lockable> copy = slots_copy();

@@ -923,7 +921,7 @@ using signal_st = signal_base<detail::null_mutex, T...>;
  * Recursive signal emission and emission cycles are supported too.
  */
 template <typename... T>
-using signal = signal_base<std::mutex, T...>;
+using signal = signal_base<std::recursive_mutex, T...>;
 } // namespace sigslot

This fixed the race condition for us and works as expected so far.
Sure, your patch triggers a deadlock right away in the test case dedicated to detecting this problem. Do you have any hard constraints on memory or CPU usage that rule out shared_ptr? It is not super cheap, but it could also be argued that both keeping the lock longer and using a recursive mutex offset the cost of the shared_ptr. I will look for a way to improve lifetime management.
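For reference, here is a minimal scenario (an assumed illustration, not the library's actual test case) in which holding the emission mutex deadlocks even with a recursive mutex, because a second thread blocks on disconnection while a slot is running:

#include <future>
#include <thread>

#include <sigslot/signal.hpp>

int main() {
    sigslot::signal<> sig;
    std::promise<void> started, release;

    sig.connect([&] {
        started.set_value();            // tell the other thread the slot is running
        release.get_future().wait();    // block inside the slot
    });

    std::thread t([&] {
        started.get_future().wait();
        sig.disconnect_all();           // blocks on the emission mutex...
        release.set_value();            // ...so this line is never reached
    });

    sig();   // with the patch, the mutex is held while the slot runs: deadlock
    t.join();
}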
Thank you, I had not thought about such complicated situations for emitting signals. In our use case, the signals are always emitted from the same thread, so this cannot happen. We don't want to avoid shared_ptr.
Yes, thread-safe signal emission is trickier than it appears. One idea I can think of would be to add a lightweight spin mutex per slot and lock it before slot invocation and slot destruction. That would be cheap most of the time, because contention would be quite low. With a normal spin mutex this would also make the slot callable thread-safe as a side effect; otherwise a RW spin mutex could be designed to allow concurrent invocation of slots.
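A sketch of that idea (an assumed structure, not a committed design): each slot owns a tiny spin lock taken both by invocation and by disconnection, so destruction waits for any in-flight call:

#include <atomic>
#include <functional>

struct spin_mutex {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
    void lock()   { while (flag.test_and_set(std::memory_order_acquire)) {} }
    void unlock() { flag.clear(std::memory_order_release); }
};

struct guarded_slot {
    spin_mutex guard;
    std::function<void()> fn;

    void invoke() {
        guard.lock();       // cheap when uncontended
        if (fn) fn();
        guard.unlock();
    }

    void disconnect() {
        guard.lock();       // waits for an in-flight invocation to finish
        fn = nullptr;
        guard.unlock();
    }
};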
So, I have come up with a preliminary design that gives strong safety guarantees for connections and lambdas in multi-threaded contexts. It should basically allow you to write the code as you expected it to work. I pushed it on the safe-observer branch. Would you mind testing it to see if it solves your problems?
My first implementation was not satisfactory, so I pushed a second proof of concept that should work as expected. I added a couple of tests to check the behaviour.
Thank you for creating the branch, and sorry for the delay. We were busy with other things, but have now tested your changes. There seems to be a problem with the […]. The implementation actually seems to work for our problem, though. We also added a sigslot test with our exact situation; it passes on the branch (and does not pass on master):

#include <cassert>
#include <chrono>
#include <thread>

#include <sigslot/signal.hpp>

static void test_disconnect_while_slot_is_running() {
    sigslot::signal<> sig;
    int shared_state = 1;

    auto slot = [&shared_state]() {
        shared_state = 1;
        std::this_thread::sleep_for(std::chrono::milliseconds(200));
        assert(shared_state != 0);
    };
    sigslot::scoped_connection conn = sig.connect(slot);

    auto emit = [&sig]() {
        sig();
    };
    std::thread emitter(emit);

    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    assert(shared_state != 0);

    // disconnect while the slot is sleeping, then invalidate its state
    conn.disconnect();
    shared_state = 0;

    emitter.join();
}
I think that the issue also exists in sigslot::observer:

#include <atomic>
#include <future>
#include <thread>

#include <gtest/gtest.h>
#include <sigslot/signal.hpp>

namespace {

class SigslotObserver : public sigslot::observer {
private:
    std::promise<void> _called_prom;
    std::atomic<bool> _called{false};
    std::atomic<bool> &_executing;

public:
    SigslotObserver(std::atomic<bool> &executing) :
        _executing(executing)
    {}

    ~SigslotObserver() {
        disconnect_all();
    }

    void slot(const int &msg) {
        using namespace std::literals::chrono_literals;
        _executing = true;
        if (!_called) {
            _called_prom.set_value();
            _called = true;
        }
        std::this_thread::sleep_for(2ms);
        _executing = false;
    }

    std::future<void> fut() {
        return _called_prom.get_future();
    }
};

} // namespace

TEST(SigslotObserver, SlotNotCalledAfterObserverIsDestroyed) {
    using namespace std::literals::chrono_literals;

    sigslot::signal<int> signal;
    std::atomic<bool> keep_signalling = true;
    std::thread feeder{[&]() {
        while (keep_signalling) {
            signal(1);
        }
        // make a last call to the router
        signal(1);
    }};

    std::atomic<bool> executing = false;
    {
        SigslotObserver obs{executing};
        signal.connect(&SigslotObserver::slot, &obs);
        EXPECT_EQ(obs.fut().wait_for(100ms), std::future_status::ready);
    }
    // here, the observer is out of scope, and shall be destroyed and disconnected;
    // if the slot is still executing, this will have bad consequences, since it
    // could access member variables which are now out of scope
    EXPECT_EQ(executing, false);

    // we can now stop signalling and join the background thread
    keep_signalling = false;
    feeder.join();
}
The test above does pass on the safe-observer branch.

Are there any plans of merging this branch at some point in the future?
Yes, I consider this feature to be essential for the usability of the library. I am however dissatisfied with the current implementation. Deferred destruction is a kind of garbage collection, which is not easy to get both right and cheap. This attempt relies on atomic ref-counting and spin waiting on disconnection, which might be disastrous for some use cases and also incurs a non-negligible cost on every slot invocation. I would like to implement a better mechanism to mitigate the cost of tracking usages of a slot, and I have read about deferred ref-counting and other strategies, but those are non-trivial to implement and I have not found the time to get to them yet.
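In outline, the scheme being described works roughly like this (an assumed sketch, not the actual branch code): each slot counts in-flight invocations atomically, and disconnection spin-waits for the count to drain:

#include <atomic>
#include <functional>

struct counted_slot {
    std::function<void()> fn;
    std::atomic<int> in_flight{0};
    std::atomic<bool> connected{true};

    void invoke() {
        in_flight.fetch_add(1, std::memory_order_acquire);  // cost paid on every call
        if (connected.load(std::memory_order_acquire))
            fn();
        in_flight.fetch_sub(1, std::memory_order_release);
    }

    void disconnect() {
        connected.store(false, std::memory_order_release);
        // spin waiting: fine when idle, potentially disastrous under heavy emission
        while (in_flight.load(std::memory_order_acquire) != 0) {}
    }
};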
Interesting. I did discuss the […]. I assume that a solution with a (recursive) mutex and (maybe a) condition variable would impose too much of a performance penalty, if you consider even the atomic reference counting too expensive.
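For comparison, the mutex-plus-condition-variable variant mentioned here would look roughly like this (a sketch under the same assumptions as above):

#include <condition_variable>
#include <functional>
#include <mutex>

struct cv_slot {
    std::function<void()> fn;
    std::mutex m;
    std::condition_variable cv;
    int running = 0;
    bool connected = true;

    void invoke() {
        {
            std::lock_guard<std::mutex> lk(m);
            if (!connected) return;
            ++running;
        }
        fn();                      // run the slot outside the lock
        {
            std::lock_guard<std::mutex> lk(m);
            --running;
        }
        cv.notify_all();
    }

    void disconnect() {
        std::unique_lock<std::mutex> lk(m);
        connected = false;
        cv.wait(lk, [this] { return running == 0; });  // wait for in-flight calls
    }
};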
Yes, mutexes are not an option. I have a few ideas on how this could be solved; I will try to get to it soonish.
Hi, any updates on this?
Sorry for the delay, I have been busy with other things. I absolutely want to fix this, but it is not as easy as it looks. The proof of concept on the safe-observer branch is not good enough and fails on some corner cases. The way I tried to fix this is basically equivalent to implementing a poor man's rc-based garbage collector, which is completely out of the scope of the project. I will look at it from some other angle.