Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jabber response is not delivered to recipient if sendnig thread is the same that was listened before #99

Open
KotM opened this issue Sep 20, 2018 · 5 comments

Comments

@KotM
Copy link

KotM commented Sep 20, 2018

We are using Jabber as interprocess communication system between different part of our application. We have file explorer addin and Agent. The Addin communicates with the Agent to get necessary information.

Recently we have encountered with the issue, when the addin was unable to get Jabber response from Agent, while the Agent reported that the response was sent, and prosody logs confirmed that.

I have discovered curious dependency from our logs: if Jabber request was sent from the same thread that was received the response before, it is unable to receive the response. You can see what I mean from the logs below:

20 Sep 2018 15:15:52,970 [1] DEBUG: !!SendPayload send 
20 Sep 2018 15:15:52,994 [19] DEBUG: !!SendPayload received 
20 Sep 2018 15:15:53,133 [23] DEBUG: !!SendPayload send 
20 Sep 2018 15:15:53,158 [19] DEBUG: !!SendPayload received 
20 Sep 2018 15:15:56,031 [1] DEBUG: !!SendPayload send 
20 Sep 2018 15:15:56,038 [19] DEBUG: !!SendPayload received 
20 Sep 2018 15:15:56,041 [1] DEBUG: !!SendPayload send 
20 Sep 2018 15:15:56,062 [19] DEBUG: !!SendPayload received 
20 Sep 2018 15:15:56,213 [19] DEBUG: !!SendPayload send 

As you can see, when thread 19 was attempted to send the request, it was not able to receive the response.

Truncated part of our code operating with Jabber:

        protected TResponseType SendPayload<TRequestType, TResponseType>(TRequestType request)
            where TRequestType : class
            where TResponseType : class
        {
            var resetEvent = new AutoResetEvent(false);

            try
            {
                Context.Current.Logger.DebugFormat("!!SendPayload send {0}", request);

                Exception ex = null;
                var response = default(TResponseType);
                SendPayload(request, (state, iq) =>
                {
                    try
                    {
                        if (iq.Type.HasFlag(IQType.error))
                        {
                            ex = CreateAndLogException(iq);
                            Context.Current.Logger.ErrorFormat("Error Request: {0}", request.ToString());
                            return;
                        }
                        response = Payload.Get<TResponseType>(iq);
                    }
                    catch (Exception innerEx)
                    {
                        ex = innerEx;
                    }
                    finally
                    {
                        resetEvent.Set();
                        Context.Current.Logger.DebugFormat("!!SendPayload received {0}", request);
                    }
                });

                var token = _cancelTokenSource.Token;
                var waitResult = WaitHandle.WaitAny(new WaitHandle[] { resetEvent, token.WaitHandle }, ServiceTimeout);
                }
......................

            }
            catch (Exception ex)
            {
                Context.Current.Logger.Error("SendPayload Exception.", ex);
                throw;
            }
            finally
            {
                resetEvent.Close();
            }
        }

The stack trace stops on WaitHandle.WaitAny() command.

Actually, I'm not sure whether it is our application issue, or Jabber configuration issue, or Jabber code issue. Any suggestions are welcome.

P.S. We didn't get any errors with that code before. It was occurred when we tried to add new functionality into the Addin. Unfortunately, we don't control Addin threading, just file explorer does it. I'm trying to resolve if from Addin side (currently without success), but I hope you can provide any ideas what is happened here...

@ForNeVeR
Copy link
Member

ForNeVeR commented Sep 20, 2018

Yeah, I'm aware of gross threading issues in Jabber-Net and I'm trying to fix them whenever I can. It is likely that your issue is caused by Jabber-Net code.

The current situation is that I could really use some help on this kind of issues. They are extremely hard to reproduce and are extremely hard to fix without breaking anything :(

For now, I'm mainly working on #95 and it required a big rework of our internals. If you could debug your issue and provide any local "fix" for it, I think I'd merge that for now.

Unit tests are much appreciated!

@ForNeVeR
Copy link
Member

Could you please also provide information about what version are you using?

@KotM
Copy link
Author

KotM commented Sep 20, 2018

It was reproduced on old version 2.1.0.710, but before creating the issue I was upgraded my project to 3.0.0.37668, and it was reproduced anyway.

Btw, we're continue using of old 2.1.0 due to unreleased critical (for us) issue. Just waiting for merging and releasing. :)

@KotM
Copy link
Author

KotM commented Sep 20, 2018

Regarding to the help from my side: unfortunately I can't provide an example, because it is part of enterprise solution, and file explorer addin uses licensed component which I can't share.

But I can help with testing of any versions of JabberNet for that case.

@ForNeVeR
Copy link
Member

Yes, I understand that #95 is critical pretty much for everyone. It is very painful to fix, but I'm working on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants