rxe failed connectivity test #51
Comments
Hello. BTW: are you working with a Mellanox HCA, or some Ethernet NIC like Intel or Broadcom? |
Hi @yonatanco, thanks so much for responding! I'm trying 4.8-rc5 now, and I'll send you my feedback later. BTW, here's my hardware info:
|
Oops, I still got the same problem:
|
You are using gid 0. Try with gid 1.
|
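For readers unsure which GID indexes actually exist on their device, a minimal Python sketch along these lines could list the populated entries before choosing between index 0 and index 1. It assumes the usual Linux sysfs layout `/sys/class/infiniband/<device>/ports/<port>/gids/<index>`; the device name `rxe0`, port `1`, and the helper name `list_gids` are illustrative assumptions and do not come from this thread.

```python
from pathlib import Path

# An all-zero entry means the GID slot is unused.
ZERO_GID = "0000:0000:0000:0000:0000:0000:0000:0000"


def list_gids(port_dir):
    """Return {index: gid} for the populated GID slots under port_dir/gids."""
    gids = {}
    gid_dir = Path(port_dir) / "gids"
    # Only numeric file names are GID indexes.
    entries = [p for p in gid_dir.iterdir() if p.name.isdigit()]
    for entry in sorted(entries, key=lambda p: int(p.name)):
        try:
            value = entry.read_text().strip()
        except OSError:
            # Some kernels return an error when an unused slot is read.
            continue
        if value and value != ZERO_GID:
            gids[int(entry.name)] = value
    return gids


if __name__ == "__main__":
    # "rxe0" and port 1 are assumptions; substitute your own device and port.
    port = Path("/sys/class/infiniband/rxe0/ports/1")
    if port.is_dir():
        for index, gid in list_gids(port).items():
            print(index, gid)
    else:
        print("no such device/port:", port)
```

An index printed by this sketch is the kind of value one would pass as the GID index to a test tool, for example through the `-g` option of `ibv_rc_pingpong`.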
Thanks for the reminder, @yonatanco. The result stays the same.
|
On 9/12/2016 2:12 PM, Chang Lou wrote:
|
@yonatanco, sorry for the late reply. I didn't receive a notification on the main page. |
Hello, mcfatealan. Since you mentioned the loopback issue, I also tested this case between two PCs: one runs Linux, the other is a VM (NAT connection). We ran the rping server on the PC, but when we ran the client on the VM, the server crashed and no longer responded to any action. You said that you have tried 4.8-rc5. I want to know how you did that: did you use the rxe-dev branch or just upgrade the kernel? I want to continue the testing, thanks! Best Regards |
Hi @anthonyliubin, I'm sorry to hear that you've had the same issue. Unfortunately, I never got the test to pass in the end. My purpose was to find a temporary solution for testing my RDMA code before our server was fixed. The time spent on this project exceeded what I had planned for, so I had to give up. But I'd still like to thank @yonatanco for all of his help! About 4.8-rc5: I just upgraded my kernel. It's kind of embarrassing that my answer might not be of any help. Anyway, that's all I know. Hope for the best! |
hi, @mcfatealan Best Regards |
I'm not 100% sure since it's been a while, but according to the description from @yonatanco, it seems that rxe-dev is already included in 4.8.0? I suggest you give it a try :) |
hi, @mcfatealan Best Regards |
@anthonyliubin congrats! so glad to hear that :) The points you mentioned are very helpful. Maybe I will test again next time according to your experience. |
Hi, In case of rping I get:
Should this work loopback/on the same machine or is this unsupported? |
Hi, @oTTer-Chief. In my testing, it did not work over loopback or on the same machine. Best Regards |
Hi @anthonyliubin , I tried testing between 2 VMs and this worked. |
Hi all, communication with the same machine is also required for the GlusterFS RDMA transport (which I was not able to get working with Linux 4.9). |
you may try this: |
Is this also necessary if the firewall is disabled? |
The default firewall rule rejects unknown connections, so a direct test will be rejected by the remote firewall. |
Any updates? Seems the loopback interface is not functioning for RDMACM, which is crucial for testing and local development. |
Maintenance of the RXE project on GitHub has stopped. You should move to upstream Linux for the kernel module and to rdma-core (https://github.com/linux-rdma/rdma-core) for the userspace library to get the latest features and bug fixes. |
Thanks for your comment! |
@Hunter21007 I also tried the GlusterFS RDMA transport with kernel 4.9.0. What do you mean by "same machine"? I have 2 VMs with 2x NIC (1x NAT, 1x host-only). Did you get the glusterfs rdma transport running with Soft-RoCE? |
@githubfoam According to my inquiry on the linux-rdma mailing list, several RXE bugs were fixed in 4.9/4.10/4.11, and the suggestion is to upgrade to 4.14/4.15 (e.g. Ubuntu 18.04 or Debian unstable). If the problem persists, let us know. |
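To check a node against the 4.14 suggestion in the comment above, a small Python sketch like this could compare the running kernel release with that minimum. The helper names `kernel_version` and `meets_suggestion` are hypothetical, and the check looks only at the major and minor numbers; it cannot tell whether a distribution kernel carries backported RXE fixes.

```python
import platform
import re


def kernel_version(release):
    """Extract (major, minor) from a release string such as '4.7.0-rc3+'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_suggestion(release, minimum=(4, 14)):
    """Return True if the release is at least the suggested minimum version."""
    return kernel_version(release) >= minimum


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    release = platform.release()
    status = "meets" if meets_suggestion(release) else "is older than"
    print(f"{release} {status} the suggested 4.14 minimum")
```

Comparing tuples rather than strings keeps `4.9` ordered before `4.14`, which a plain string comparison would get wrong.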
@githubfoam Host only means the glusterfs server and client are on the same machine via 127.0.0.1. No, I did not manage to make it work. And now it is even out of scope, because glusterfs rdma support was dropped. So this one is not relevant anymore. |
@Hunter21007 could you provide a link that shows glusterfs-rdma support was dropped? The site over here suggests two links; however, both links lead nowhere. |
@byronyi I tried what's suggested on this website. My nodes are configured with "Ubuntu 16.04.4 LTS-Linux 4.7.0-rc3+" after installing the kernel/user spaces. I can't run ping pong; the rxe test fails. I don't understand how you upgrade to the 4.14/4.15 kernels. With this kernel space, the kernel is upgraded from "4.4.0-116-generic" to "4.7.0-rc3+". |
I am able to do ping pong with rxe, but rdma_cm is failing when it comes to gluster-rdma support. Port 24008 is never started because rdma_cm fails with [No Device Found]. |
@lalith-b if you read the whole thread, glusterfs rdma was dropped at that time. If you have information that says otherwise, could you please share it? Where I left off, TCP worked but RDMA did not. |
Hi @monis410, I'm an RDMA beginner. I ran into a problem very similar to the previous issue (#49).
I worked around it by moving /usr/lib64/* to /usr/lib/, but after that I have problems with the connectivity tests.
My OS is Ubuntu 16.04 LTS (4.7.0-rc3+).
Some of my test results:
Then I tested connectivity both on a single machine (self-to-self) and between one physical machine and one virtual machine. The machines can ping each other, so basic network connectivity is fine. The test result is exactly the same in both setups.
Could you help me take a look at it? Thank you so much!
BTW, does Soft RoCE work well with python-rdma (https://github.com/jgunthorpe/python-rdma)? I tested that too and it failed; I'm not sure whether the two problems share the same root cause.