PvaClientChannel not connecting #50
Let me give a suggestion that might make it easier to show the problem.
Before any other call to any pvaClient code issue the statement:
PvaClient.setDebug(true);
What do you see?
Marty
On 3/25/19 4:15 AM, Matthew Taylor wrote:
We're seeing some odd behaviour with the PvaClientChannel, where it
sometimes fails to connect.
The code we are using is:
PvaClientChannel pvaChannel =
pvaClient.createChannel(device.getName(), "pva");
pvaChannel.issueConnect();
Status status = pvaChannel.waitConnect(timeout);
It's only in one specific code route call chain that it sometimes
fails, in others it seems to be ok, so there may be other things going
on which are affecting it, however the failure seems to be in the
pvaccess libraries.
Looking at tcpdump for the times it fails and the times it doesn't, it
seems that in both cases, the PVA Client Search is sent, and the
Server SEARCH_RESPONSE is sent in return. In the case where it does
work, the client then sends out the CREATE_CHANNEL ok and the channel
is created, but in the times where it doesn't work, the client doesn't
send out the CREATE_CHANNEL.
See attached for tcpdump logs.
worked - a time when it worked. Client search is at number 6387 when
looking in wireshark.
broke - a time when it didn't work. Client search is at number 3370
when looking in wireshark.
pause - the code call path that always works. Client search is at
number 2532 when looking in wireshark.
Any help in diagnosing what's going on would be greatly appreciated.
tcpdumps.tar.gz
<https://github.com/epics-base/epicsCoreJava/files/3002149/tcpdumps.tar.gz>
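When comparing captures like the ones described above, it can help to label each PVA message by its command byte. The following is a minimal sketch, not code from the pvAccess libraries. It assumes the 8-byte header layout and command codes from the pvAccess protocol specification as I understand it (magic 0xCA, version, flags, command, 4-byte payload size; 0x03 search, 0x04 search response, 0x07 create channel). The class name and the sample bytes are hypothetical, and the codes should be checked against the specification before being relied on.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Sketch: label the fixed 8-byte header of a pvAccess application message. */
final class PvaHeaderSketch {

    /** Command codes of interest (assumed from the pvAccess protocol specification). */
    static String commandName(int command) {
        switch (command) {
            case 0x03: return "SEARCH";
            case 0x04: return "SEARCH_RESPONSE";
            case 0x07: return "CREATE_CHANNEL";
            case 0x08: return "DESTROY_CHANNEL";
            default:   return String.format("OTHER(0x%02X)", command);
        }
    }

    /** Returns "sender COMMAND payload=N", or a marker if the bytes are not a PVA header. */
    static String describe(byte[] header) {
        if (header.length < 8 || (header[0] & 0xFF) != 0xCA) {
            return "not a PVA message";
        }
        int flags = header[2] & 0xFF;
        int command = header[3] & 0xFF;
        // Assumption: flag bit 7 selects big-endian, bit 6 marks a server message.
        ByteOrder order = (flags & 0x80) != 0 ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
        String sender = (flags & 0x40) != 0 ? "server" : "client";
        int payloadSize = ByteBuffer.wrap(header, 4, 4).order(order).getInt();
        return sender + " " + commandName(command) + " payload=" + payloadSize;
    }

    public static void main(String[] args) {
        // Made-up header bytes, for illustration only.
        byte[] search   = {(byte) 0xCA, 2, 0x00, 0x03, 0x30, 0, 0, 0};
        byte[] response = {(byte) 0xCA, 2, 0x40, 0x04, 0x2F, 0, 0, 0};
        byte[] create   = {(byte) 0xCA, 2, 0x00, 0x07, 0x18, 0, 0, 0};
        System.out.println(describe(search));
        System.out.println(describe(response));
        System.out.println(describe(create));
    }
}
```

In a failing capture as described, the sequence would stop after the server's SEARCH_RESPONSE with no client CREATE_CHANNEL following it.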
Hi Marty, thanks for responding.
Compared to this on the times it does work:
(We call pvaChannel.destroy() after checking the result of 'Status'.)
This helps.
Also what happens when you specify :
Status status = pvaChannel.waitConnect(timeout);
System.out.println("status " + status);
and
Status status = pvaChannel.waitConnect(0);
System.out.println("status " + status);
I am starting to agree with your suspicion that it is a pvAccess problem.
Marty
On 3/25/19 7:54 AM, Matthew Taylor wrote:
Hi Marty, thanks for responding.
I see this on the times it fails to connect:
PvaClientChannel::issueConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::issueConnect calling provider->createChannel
PvaClientChannel::channelCreated channel BL99P-ML-SCAN-01 connectState connectActive isConnected false status.isOK true
PvaClientChannel::waitConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::destroy() channel BL99P-ML-SCAN-01
Compared to this on the times it does work:
PvaClientChannel::issueConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::issueConnect calling provider->createChannel
PvaClientChannel::channelCreated channel BL99P-ML-SCAN-01 connectState connectActive isConnected false status.isOK true
PvaClientChannel::waitConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::channelStateChange channel BL99P-ML-SCAN-01 isConnected true
PvaClientChannel::destroy() channel BL99P-ML-SCAN-01
(I'm including the monitorEvent messages this time, as they intersect the channel connect calls. We are monitoring some endpoints on the same channel with a PvaClientChannel monitor().)
With waitConnect(timeout), we get:
And with waitConnect(0), we get:
The following is strange:
And with waitConnect(0), we get:
PvaClientChannel::issueConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::issueConnect calling provider->createChannel
PvaClientChannel::channelCreated channel BL99P-ML-SCAN-01 connectState connectActive isConnected false status.isOK true
PvaClientChannel::waitConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::destroy() channel BL99P-ML-SCAN-01
Why is destroy being called immediately?
It should wait forever since the code in waitConnect is:
    try {
        if(timeout>0.0) {
            long nano = (long)(timeout*1e9);
            waitForConnect.awaitNanos(nano);
        } else {
            waitForConnect.await();
        }
    } catch(InterruptedException e) {
        Status status = statusCreate.createStatus(StatusType.ERROR,e.getMessage(),
            e.fillInStackTrace());
        return status;
    }
Can you show the return value from waitConnect?
That is
Status status = pvaChannel.waitConnect(0);
System.out.println("status " + status);
Also try:
Status status = pvaChannel.waitConnect(10);
System.out.println("status " + status);
Marty
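The waitConnect excerpt quoted above can end in three ways: the condition is signalled, the timed wait runs out, or the waiting thread is interrupted. Only the last one lets a wait with timeout 0 return early. The sketch below reproduces that shape using only JDK classes. It is not the pvaClient implementation: the class name, the string results, and the channelConnected() method are hypothetical stand-ins, and unlike the excerpt it re-checks a flag in a loop to guard against spurious wakeups.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Sketch of a connect wait: timed when timeout > 0, unbounded otherwise. */
final class WaitConnectSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition waitForConnect = lock.newCondition();
    private boolean connected = false;

    /** Hypothetical callback: another thread calls this when the channel connects. */
    void channelConnected() {
        lock.lock();
        try {
            connected = true;
            waitForConnect.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Timeout is in seconds; 0 (or negative) means wait with no time limit. */
    String waitConnect(double timeout) {
        lock.lock();
        try {
            long nanos = (long) (timeout * 1e9);
            while (!connected) {
                if (timeout > 0.0) {
                    if (nanos <= 0L) {
                        return "ERROR: channel not connected"; // the timed wait ran out
                    }
                    nanos = waitForConnect.awaitNanos(nanos);
                } else {
                    waitForConnect.await(); // returns only on signal or interrupt
                }
            }
            return "OK";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            return "ERROR: interrupted";
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        WaitConnectSketch never = new WaitConnectSketch();
        System.out.println(never.waitConnect(0.05)); // nobody signals: the wait times out

        WaitConnectSketch soon = new WaitConnectSketch();
        new Thread(soon::channelConnected).start();
        System.out.println(soon.waitConnect(0)); // unbounded wait, ended by the signal

        Thread.currentThread().interrupt(); // simulate an interrupt from elsewhere
        System.out.println(never.waitConnect(0)); // unbounded wait, ended by the interrupt
        Thread.interrupted(); // clear the flag again
    }
}
```

With a wait of this shape, an immediate return from waitConnect(0) points at an interrupt of the calling thread rather than at the timeout logic.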
On 3/25/19 10:47 AM, Matthew Taylor wrote:
(I'm including the monitorEvent messages this time, as they intersect
the channel connect calls. We are monitoring some endpoints on the
same channel with a PvaClientChannel monitor() )
with waitConnect(timeout), we get:
PvaClientChannel::issueConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::issueConnect calling provider->createChannel
PvaClientChannel::channelCreated channel BL99P-ML-SCAN-01 connectState connectActive isConnected false status.isOK true
PvaClientChannel::waitConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::destroy() channel BL99P-ML-SCAN-01
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
status StatusImpl [type=ERROR, message=channel not connected]
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
And with waitConnect(0), we get:
PvaClientChannel::issueConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::issueConnect calling provider->createChannel
PvaClientChannel::channelCreated channel BL99P-ML-SCAN-01 connectState connectActive isConnected false status.isOK true
PvaClientChannel::waitConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::destroy() channel BL99P-ML-SCAN-01
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
status StatusImpl [type=ERROR, stackDump= java.lang.InterruptedException
    at org.epics.pvaClient.PvaClientChannel.waitConnect(PvaClientChannel.java:403)
    at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.createAndCheckChannel(MalcolmEpicsV4Connection.java:165)
    at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.sendCallMessage(MalcolmEpicsV4Connection.java:302)
    at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.send(MalcolmEpicsV4Connection.java:103)
    at org.eclipse.scanning.malcolm.core.MalcolmDevice.call(MalcolmDevice.java:469)
    at org.eclipse.scanning.malcolm.core.MalcolmDevice.lambda$3(MalcolmDevice.java:462)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748) ]
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent PvaClientMonitor::monitorEvent
It is worth saying, though, that the other channel appears to be destroyed both in the times it does work and in the times it doesn't.
On 3/25/19 11:51 AM, Matthew Taylor wrote:
Why is destroy being called immediately?
I've put some more logging in and i think this is the destroy for
an earlier call. *
Can you show the return value from waitConnect?
I think that is being logged already, in the lines:
status StatusImpl [type=ERROR, message=channel not connected]
when there is a timeout, and
status StatusImpl [type=ERROR, stackDump= java.lang.InterruptedException
    at org.epics.pvaClient.PvaClientChannel.waitConnect(PvaClientChannel.java:403)
    at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.createAndCheckChannel(MalcolmEpicsV4Connection.java:165)
    at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.sendCallMessage(MalcolmEpicsV4Connection.java:302)
    at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.send(MalcolmEpicsV4Connection.java:103)
    at org.eclipse.scanning.malcolm.core.MalcolmDevice.call(MalcolmDevice.java:469)
    at org.eclipse.scanning.malcolm.core.MalcolmDevice.lambda$3(MalcolmDevice.java:462)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748) ]
when timeout is 0.
* What we're doing is creating a channel to send the run() command
(using RPCClientImpl.request()). In a separate thread, we're
sending the abort() command, but first we connect using the code
above to check that we can make a connection before sending the
RPC request. It's this abort call that is failing. I'm just trying
to find out why this other channel is being destroyed at this
point, as I don't think it should be.
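The stack trace above shows the InterruptedException surfacing inside a task run by FutureTask on a ThreadPoolExecutor worker. The thread does not establish what interrupted that worker. As one possibility only, the sketch below shows how a task blocked in an unbounded wait is woken with InterruptedException when its Future is cancelled with cancel(true); ExecutorService.shutdownNow() interrupts workers in a similar way. All names are hypothetical and a CountDownLatch stands in for the connect wait.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Sketch: an unbounded wait inside a pool task ends early when the task is cancelled. */
final class InterruptedWaitSketch {

    /** Submits a task that waits with no time limit, cancels it, and reports how the wait ended. */
    static String runAndCancel() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        CountDownLatch neverConnected = new CountDownLatch(1); // stand-in: a connect that never arrives
        AtomicReference<String> outcome = new AtomicReference<>("still waiting");
        try {
            Future<?> task = pool.submit(() -> {
                started.countDown();
                try {
                    neverConnected.await(); // comparable to waiting with timeout 0
                    outcome.set("connected");
                } catch (InterruptedException e) {
                    outcome.set("interrupted");
                } finally {
                    finished.countDown();
                }
            });
            started.await();
            task.cancel(true); // true = interrupt the worker thread running the task
            finished.await(5, TimeUnit.SECONDS);
            return outcome.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "caller interrupted";
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndCancel());
    }
}
```

If something up the call stack cancels or shuts down the executor running the abort call, a wait like the one in waitConnect would return in this way. Whether that is what happens here would need to be confirmed in the calling code.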
I think I may see the problem.
You said that all your pvaccess work is done from a single set of
threads, i.e. under a single process.
But you are mixing pvaClient with RPCClientImpl.
Each calls org.epics.pvaccess.ClientFactory.start()
Instead of using RPCClientImpl use the pvaClient interface to channelRPC.
For an example see:
Marty
Ah, I see. We used to do it that way, but we changed to using RPCClientImpl because it supported exceptions (getting exceptions back from the channel), whereas the pvaClient interface to channelRPC didn't. Am I correct in thinking that is still the case?
On 3/26/19 7:54 AM, Matthew Taylor wrote:
Ah i see. We used to do it that way, but we changed to using
RPCClientImpl because that supported exceptions (getting exceptions
back from the channel), but the pvaClient interface to channelRPC
didn't. Am i correct in thinking that that is still the case?
Probably true.
For at least the last year I have spent almost all my time on C++ code.
What to do?
Do you want to look at changing the pvaClient interface to channelRPC
so that it supports exceptions from the channel?
Marty
Yes, that might be what we need to do. What files should I look at? How big a job is it?
I looked a little more.
The same problem exists in pvaClientCPP.
I am going to create an issue in both pvaClientCPP and pvaClientJava.
Marty
Great, thanks very much.
I'm still trying to determine why we see the RPCClientImpl.request() exception with the message "timeout" even before we've attempted to create the other channel connection. It must be something we're doing somewhere up the stack, but I'm not overly familiar with that code. Thanks again for all of your help.