Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PvaClientChannel not connecting #50

Open
MattTaylorDLS opened this issue Mar 25, 2019 · 15 comments
Open

PvaClientChannel not connecting #50

MattTaylorDLS opened this issue Mar 25, 2019 · 15 comments

Comments

@MattTaylorDLS
Copy link

We're seeing some odd behaviour with the PvaClientChannel, where it sometimes fails to connect.

The code we are using is:

PvaClientChannel pvaChannel = pvaClient.createChannel(device.getName(), "pva");
pvaChannel.issueConnect();
Status status = pvaChannel.waitConnect(timeout);

It's only in one specific code route call chain that it sometimes fails, in others it seems to be ok, so there may be other things going on which are affecting it, however the failure seems to be in the pvaccess libraries.

Looking at tcpdump for the times it fails and the times it doesn't, it seems that in both cases, the PVA Client Search is sent, and the Server SEARCH_RESPONSE is sent in return. In the case where it does work, the client then sends out the CREATE_CHANNEL ok and the channel is created, but in the times where it doesn't work, the client doesn't send out the CREATE_CHANNEL.

See attached for tcpdump logs.

worked - a time when it worked. Client search is at number 6387 when looking in wireshark.
broke - a time when it didn't work. Client search is at number 3370 when looking in wireshark.
pause - the code call path that always works. Client search is at number 2532 when looking in wireshark.

Any help in diagnosing what's going on would be greatly appreciated.

tcpdumps.tar.gz

@mrkraimer
Copy link
Contributor

mrkraimer commented Mar 25, 2019 via email

@MattTaylorDLS
Copy link
Author

Hi Marty, thanks for responding.
I see this on the times it fails to connect:

PvaClientChannel::issueConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::issueConnect calling provider->createChannel
PvaClientChannel::channelCreated channel BL99P-ML-SCAN-01 connectState connectActive isConnected false status.isOK true
PvaClientChannel::waitConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::destroy() channel BL99P-ML-SCAN-01

Compared to this on the times it does work:

PvaClientChannel::issueConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::issueConnect calling provider->createChannel
PvaClientChannel::channelCreated channel BL99P-ML-SCAN-01 connectState connectActive isConnected false status.isOK true
PvaClientChannel::waitConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::channelStateChange channel BL99P-ML-SCAN-01 isConnected true
PvaClientChannel::destroy() channel BL99P-ML-SCAN-01

@MattTaylorDLS
Copy link
Author

(We call pvaChannel.destroy() destroy after checking the result of 'Status' )

@mrkraimer
Copy link
Contributor

mrkraimer commented Mar 25, 2019 via email

@MattTaylorDLS
Copy link
Author

(I'm including the monitorEvent messages this time, as they intersect the channel connect calls. We are monitoring some endpoints on the same channel with a PvaClientChannel monitor() )

with waitConnect(timeout), we get:

PvaClientChannel::issueConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::issueConnect calling provider->createChannel
PvaClientChannel::channelCreated channel BL99P-ML-SCAN-01 connectState connectActive isConnected false status.isOK true
PvaClientChannel::waitConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::destroy() channel BL99P-ML-SCAN-01
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
status StatusImpl [type=ERROR, message=channel not connected]
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent

And with waitConnect(0), we get:

PvaClientChannel::issueConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::issueConnect calling provider->createChannel
PvaClientChannel::channelCreated channel BL99P-ML-SCAN-01 connectState connectActive isConnected false status.isOK true
PvaClientChannel::waitConnect() channel BL99P-ML-SCAN-01
PvaClientChannel::destroy() channel BL99P-ML-SCAN-01
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
status StatusImpl [type=ERROR, stackDump=
java.lang.InterruptedException
	at org.epics.pvaClient.PvaClientChannel.waitConnect(PvaClientChannel.java:403)
	at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.createAndCheckChannel(MalcolmEpicsV4Connection.java:165)
	at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.sendCallMessage(MalcolmEpicsV4Connection.java:302)
	at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.send(MalcolmEpicsV4Connection.java:103)
	at org.eclipse.scanning.malcolm.core.MalcolmDevice.call(MalcolmDevice.java:469)
	at org.eclipse.scanning.malcolm.core.MalcolmDevice.lambda$3(MalcolmDevice.java:462)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
]
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent
PvaClientMonitor::monitorEvent

@mrkraimer
Copy link
Contributor

mrkraimer commented Mar 25, 2019 via email

@MattTaylorDLS
Copy link
Author

Why is destroy being called immediately?
I've put some more logging in and i think this is the destroy for an earlier call. *

Can you the result of the return value from waitConnect?
I think that is being logged already, in the lines:

status StatusImpl [type=ERROR, message=channel not connected]
when there is a timeout. and
status StatusImpl [type=ERROR, stackDump= java.lang.InterruptedException at org.epics.pvaClient.PvaClientChannel.waitConnect(PvaClientChannel.java:403) at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.createAndCheckChannel(MalcolmEpicsV4Connection.java:165) at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.sendCallMessage(MalcolmEpicsV4Connection.java:302) at org.eclipse.scanning.connector.epics.MalcolmEpicsV4Connection.send(MalcolmEpicsV4Connection.java:103) at org.eclipse.scanning.malcolm.core.MalcolmDevice.call(MalcolmDevice.java:469) at org.eclipse.scanning.malcolm.core.MalcolmDevice.lambda$3(MalcolmDevice.java:462) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ]
when timeout is 0.

  • What we're doing is creating a channel to send the run() command (using RPCClientImpl.request()). And in a separate thread, we're sending the abort() command, but first we connect using the code on above to check that we can make a connection before sending the RPC request. It's this abort call that is failing. I'm just trying to find out why this other channel is being destroyed at this point as i don't think it should be.

@MattTaylorDLS
Copy link
Author

It is worth saying though that it looks like that other channel gets destroyed in the times it does work as well as the times it doesn't work

@mrkraimer
Copy link
Contributor

mrkraimer commented Mar 26, 2019 via email

@MattTaylorDLS
Copy link
Author

Ah i see. We used to do it that way, but we changed to using RPCClientImpl because that supported exceptions (getting exceptions back from the channel), but the pvaClient interface to channelRPC didn't. Am i correct in thinking that that is still the case?

@mrkraimer
Copy link
Contributor

mrkraimer commented Mar 26, 2019 via email

@MattTaylorDLS
Copy link
Author

Yes that might be what we need to do. What files should I look at? How big of a job is it?

@mrkraimer
Copy link
Contributor

mrkraimer commented Mar 26, 2019 via email

@MattTaylorDLS
Copy link
Author

Great, thanks very much

@MattTaylorDLS
Copy link
Author

I'm still trying to determine why we see the RPCClientImpl.request() exception with the message "timeout" even before we've attempted to create the other channel connection. It must be something we're doing somewhere up the stack but I'm not overly familiar with that code, but thanks for all of your help again

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants