Hi All,
I have two machines, each with a standard NIC. One machine acts as the NVMe host and the other as the NVMe target. The target is backed by the null block device (NULL_BLOCK_DEVICE) provided by Linux. The NVMe Discovery and Connect commands work fine, and data transfer over the Soft-RoCE interface works.
When I run read I/O with fio, the NVMe host tries to reconnect to the target and then a kernel panic occurs. The stack trace shows the fault in rdma_disconnect().
Below is the kernel log, including the stack trace, from when the panic happened.
Sep 16 16:40:44 john kernel: [ 4660.937003] nvme nvme0: rdma_resolve_addr wait failed (-104).
Sep 16 16:40:53 john kernel: [ 4669.289136] rxe: set rxe0 active
Sep 16 16:40:53 john kernel: [ 4669.289138] rxe: added rxe0 to eno1
Sep 16 16:40:53 john kernel: [ 4669.291500] interface en01 not found
Sep 16 16:41:03 john kernel: [ 4679.172136] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.0.154:1023
Sep 16 16:41:05 john kernel: [ 4681.896008] nvme nvme0: creating 4 I/O queues.
Sep 16 16:41:05 john kernel: [ 4681.928447] nvme nvme0: new ctrl: NQN "testsubsystem", addr 192.168.0.154:1023
Sep 16 16:48:32 john kernel: [ 5128.118832] blk_update_request: I/O error, dev nvme0n1, sector 664872
Sep 16 16:48:32 john kernel: [ 5128.125658] blk_update_request: I/O error, dev nvme0n1, sector 1614312
Sep 16 16:48:32 john kernel: [ 5128.132569] blk_update_request: I/O error, dev nvme0n1, sector 1309672
Sep 16 16:48:32 john kernel: [ 5128.139307] blk_update_request: I/O error, dev nvme0n1, sector 1240976
Sep 16 16:48:32 john kernel: [ 5128.146346] blk_update_request: I/O error, dev nvme0n1, sector 2037616
Sep 16 16:48:32 john kernel: [ 5128.154293] blk_update_request: I/O error, dev nvme0n1, sector 450352
Sep 16 16:48:32 john kernel: [ 5128.162782] blk_update_request: I/O error, dev nvme0n1, sector 1719776
Sep 16 16:48:32 john kernel: [ 5128.170989] blk_update_request: I/O error, dev nvme0n1, sector 441656
Sep 16 16:48:32 john kernel: [ 5128.178936] blk_update_request: I/O error, dev nvme0n1, sector 668736
Sep 16 16:48:32 john kernel: [ 5128.187821] blk_update_request: I/O error, dev nvme0n1, sector 1249384
Sep 16 16:48:32 john kernel: [ 5128.195526] nvme nvme0: reconnecting in 10 seconds
Sep 16 16:48:53 john kernel: [ 5149.206030] nvme nvme0: failed nvme_keep_alive_end_io error=16391
Sep 16 16:49:42 john kernel: [ 5198.356270] nvme nvme0: Connect command failed, error wo/DNR bit: 7
Sep 16 16:49:42 john kernel: [ 5198.362922] nvme nvme0: Failed reconnect attempt, requeueing...
Sep 16 16:49:53 john kernel: [ 5209.619737] nvme nvme0: rdma_resolve_addr wait failed (-110).
Sep 16 16:49:53 john kernel: [ 5209.620031] nvme nvme0: Failed reconnect attempt, requeueing...
[ 5219.859419] general protection fault: 0000 [#1] SMP
[ 5219.864479] Modules linked in: rdma_ucm ib_uverbs nvme_rdma(OE) rdma_cm iw_cm ib_cm configfs nvme_fabrics(OE) nvme_core(OE) rdma_rxe ip6_udp_tunnel udp_tunnel ib_core binfmt_misc snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel intel_powerclamp snd_hda_codec coretemp kvm_intel snd_hda_core kvm snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq gpio_ich joydev snd_seq_device input_leds snd_timer snd irqbypass mei_me serio_raw mei soundcore lpc_ich mac_hid parport_pc ppdev lp parport autofs4 i915 hid_microsoft hid_generic i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops e1000e psmouse usbhid ptp hid drm pps_core pata_acpi fjes video
[ 5219.931164] CPU: 3 PID: 4130 Comm: kworker/3:0 Tainted: G OE 4.8.0-rc1+ #1
[ 5219.939458] Hardware name: /DH55TC, BIOS TCIBX10H.86A.0037.2010.0614.1712 06/14/2010
[ 5219.949302] Workqueue: nvme_rdma_wq nvme_rdma_reconnect_ctrl_work [nvme_rdma]
[ 5219.956929] task: ffff8d0d2b8b4240 task.stack: ffff8d0d87ab8000
[ 5219.963223] RIP: 0010:[] [] rdma_disconnect+0x2e/0x90 [rdma_cm]
[ 5219.972958] RSP: 0018:ffff8d0d87abbdb0 EFLAGS: 00010206
[ 5219.978541] RAX: 6e5f656572745f88 RBX: ffff8d0d34914400 RCX: 0000000000000001
[ 5219.986052] RDX: ffff8d0d34917800 RSI: ffff8d0d35cd8580 RDI: ffff8d0d2b399a00
[ 5219.993504] RBP: ffff8d0d87abbdb8 R08: ffff8d0da34d8c40 R09: 0000000000000002
[ 5220.001116] R10: 0000000000000000 R11: 0000000000003000 R12: ffff8d0d915e9930
[ 5220.008680] R13: ffffe58dffac2600 R14: 00000000000000c0 R15: ffff8d0d915e9930
[ 5220.016211] FS: 0000000000000000(0000) GS:ffff8d0da34c0000(0000) knlGS:0000000000000000
[ 5220.024747] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5220.030719] CR2: 0000556ef4ce1db8 CR3: 00000000afe06000 CR4: 00000000000006e0
[ 5220.038181] Stack:
[ 5220.040308] ffff8d0d914b2400 ffff8d0d87abbdd0 ffffffffc061184e ffff8d0d915e9800
[ 5220.048170] ffff8d0d87abbdf8 ffffffffc0611aef ffff8d0d909b2480 ffff8d0da34d8c40
[ 5220.056159] ffffe58dffac2600 ffff8d0d87abbe38 ffffffff8909eac2 0000000000000000
[ 5220.064048] Call Trace:
[ 5220.066635] [] nvme_rdma_stop_and_free_queue+0x1e/0x40 [nvme_rdma]
[ 5220.074886] [] nvme_rdma_reconnect_ctrl_work+0x7f/0x1d0 [nvme_rdma]
[ 5220.083235] [] process_one_work+0x162/0x4b0
[ 5220.089394] [] worker_thread+0x4b/0x4f0
[ 5220.095199] [] ? process_one_work+0x4b0/0x4b0
[ 5220.101693] [] ? process_one_work+0x4b0/0x4b0
[ 5220.108080] [] kthread+0xf8/0x110
[ 5220.113441] [] ret_from_fork+0x1f/0x40
[ 5220.119170] [] ? kthread_worker_fn+0x1a0/0x1a0
[ 5220.125594] Code: 66 90 55 48 89 e5 53 48 89 fb 48 8b bf 00 03 00 00 48 85 ff 74 65 0f b6 83 b8 01 00 00 48 8b 13 48 c1 e0 04 48 03 82 f8 00 00 00 <8b> 50 08 f6 c2 04 75 14 83 e2 08 b8 ea ff ff ff 74 07 31 f6 e8
[ 5220.146752] RIP [] rdma_disconnect+0x2e/0x90 [rdma_cm]
[ 5220.153918] RSP
[ 5220.168895] ---[ end trace 4e3fbc3ad0b11617 ]---
[ 5220.168899] Kernel panic - not syncing: Fatal exception
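For reference, the numeric codes in the log can be decoded with a minimal sketch like the one below. It assumes the Linux errno numbering (104 and 110) and the NVMe status constants NVME_SC_DNR (0x4000) and NVME_SC_ABORT_REQ (0x7) as I understand them from the kernel's include/linux/nvme.h. The helper names decode_errno and decode_nvme_status are made up for illustration and are not kernel or library APIs.

```python
# Linux errno values, hardcoded so the result does not depend on the local OS.
LINUX_ERRNO = {104: "ECONNRESET", 110: "ETIMEDOUT"}

# Assumed to match the constants in the kernel's include/linux/nvme.h.
NVME_SC_DNR = 0x4000        # "Do Not Retry" bit
NVME_SC_ABORT_REQ = 0x7     # Command Abort Requested
NVME_STATUS_NAMES = {NVME_SC_ABORT_REQ: "NVME_SC_ABORT_REQ"}


def decode_errno(value):
    """Map a (possibly negative) kernel return code to its errno name."""
    return LINUX_ERRNO.get(abs(value), "unknown")


def decode_nvme_status(status):
    """Split a status value into (code, dnr_set, name)."""
    dnr_set = bool(status & NVME_SC_DNR)
    code = status & ~NVME_SC_DNR
    return code, dnr_set, NVME_STATUS_NAMES.get(code, "unknown")


if __name__ == "__main__":
    # Values taken from the log lines above.
    for rc in (-104, -110):
        print("rdma_resolve_addr wait failed", rc, "->", decode_errno(rc))
    for status in (16391, 7):
        print("nvme status", status, "->", decode_nvme_status(status))
```

If these assumptions hold, the keep-alive error 16391 (0x4007) and the Connect failure code 7 would both be "abort requested" statuses, with the DNR bit set only on the keep-alive, and the two rdma_resolve_addr failures would be a connection reset followed by a timeout.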
Regards
John