Pktgen throws Segment Fault in various test-cases #265
Comments
Pktgen requires DPDK arguments and Pktgen arguments, which I do not see in the first problem. Please send me the complete command line used. For the second error, it could be that you are loading a very large pcap file. Pktgen has a limit on the size or number of packets in a pcap file, but it should not crash.
It appears the command line has three NICs defined but is only using two NICs. Is that what you wanted?
Another test would be to split the two NICs so that they are handled by two lcore groups: -m "[3-4:7-8].0" -m "[5-6:9-10].1". One more item: I do not see the pcap file option on the command line. Does this happen without a pcap file? Also remove --socket-mem 8192, as it is not really needed anymore. Here is the command line I used for some testing; it is generated by the ./tools/run.py script. BTW, did you build Pktgen to include Lua?
Hi @KeithWiles, I also have this same problem. My full command line is: sudo -E ./usr/local/bin/pktgen -l 1,2,3 -n 4 -a 01:00.0 -a 02:00.0 --proc-type auto -- -P -m [2].0 -m [3].1 -f themes/black-yellow.theme -s 0:pcap/traffic_sample.pcap. I have separated the core groups and specified one core for pktgen and one for each NIC. Lua is not enabled; does it need to be? When inspecting the call to rte_zmalloc_socket, where this fails, all parameters look reasonable. What could the error be? Is it to do with memory allocation for DPDK?
Yes, DPDK may be trying to allocate a given amount of memory and failing. Where was rte_zmalloc_socket being called from?
Lua does not need to be enabled.
It is called from pktgen_pcap_open in pktgen-pcap.c
My output for dpdk-hugepages.py -s is: Hugepages mounted on /dev/hugepages
How many NUMA zones are on this machine? Also please give me the output from
Here are all the values passed to rte_zmalloc_socket: numstat -m: this is numastat -m, what am I supposed to be expecting here?
I see the problem: you only have one socket/NUMA zone and DPDK is returning -1 as the socket ID. DPDK returning -1 is going to break any request to DPDK based on the NUMA zone ID. For this specific call to rte_zmalloc_socket() we can test for -1 and set sid to 0. This problem will most likely happen in other calls that use a socket. :-(
Is this a problem that is unavoidable due to only having one NUMA node, or is this a fixable error? I'm a little unclear.
In the pktgen-pcap.c file around line 130 change Then replace the line 135 with the following:
I was not able to test this code.
Okay, it seems to pass the rte_zmalloc_socket call, but generates invalid memory errors down the line, which are related to other rte_zmalloc_socket calls like you mentioned.
Any place in the code where a socket ID or NUMA ID is used will most likely have this problem. I guess to fix these I would need to replace all of the socket ID based DPDK calls with a routine verifying the socket ID returned is valid.
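A minimal sketch of the kind of routine described above, for readers following along. It is not the code on any Pktgen branch: sanitize_socket_id, checked_zmalloc_socket and fake_zmalloc_socket are hypothetical names, and the allocator is a heap-based stand-in so the example builds without DPDK. The stand-in rejects a negative socket ID only to mirror the failure reported in this thread; that is an assumption of the sketch, not a statement about how rte_zmalloc_socket() behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not part of Pktgen or DPDK): map an invalid socket ID
 * (negative, such as -1, or beyond the known node count) to node 0. */
int
sanitize_socket_id(int sid, int numa_nodes)
{
    assert(numa_nodes > 0);
    if (sid < 0 || sid >= numa_nodes)
        return 0;
    return sid;
}

/* Stand-in for a socket-aware zeroing allocator. It allocates from the
 * normal heap and rejects a negative socket ID purely for illustration. */
void *
fake_zmalloc_socket(size_t size, int sid)
{
    if (sid < 0)
        return NULL;
    return calloc(1, size);
}

/* Hypothetical wrapper: callers go through one place that validates the
 * socket ID before the allocation is attempted. */
void *
checked_zmalloc_socket(size_t size, int sid, int numa_nodes)
{
    return fake_zmalloc_socket(size, sanitize_socket_id(sid, numa_nodes));
}
```

With a single NUMA node, checked_zmalloc_socket(size, -1, 1) ends up allocating on node 0 instead of failing. Centralizing the check means each call site does not need its own test for -1.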
Please give the branch fix-socket-crash on the pktgen repo a try and see if it gets you working. I did not try to fix all of the NUMA/socket related problems and more issues may exist.
I will do this ^^. However, where should I add pg_zmalloc_socket? I also notice that in other parts, calling rte_eth_dev_socket_id does not return -1, but instead returns very large numbers. I have given the two I found in binary below: 100110101100100 1011111100110100110101100100 These aren't simple ones like just -1 being wrapped around, so I'm not sure why or how this happens. Based on the fact that I only have 1 NUMA node, it should always return sid 0, no? For this reason, pg_zmalloc_socket does not fix it, so I am manually changing all instances of sid to 0 to see if that will work correctly.
The call to get the socket_id should be returning SOCKET_ID_ANY, which is -1; pg_zmalloc_socket() should detect this and return the correct memory. The other locations in the code that use the socket ID will still return -1. In the case of a huge number, it is possible the variable used is an unsigned value. If the value is unsigned it needs to be signed.
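To illustrate the signed versus unsigned point, here is a small standalone sketch in plain C (no DPDK needed). The function names are hypothetical and exist only for this example. It shows what -1 turns into when it is stored in an unsigned variable, and why a "less than zero" check stops working afterwards.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Storing a socket ID in an unsigned 16-bit variable: -1 becomes 65535. */
uint16_t
store_sid_unsigned16(int sid)
{
    return (uint16_t)sid;
}

/* Storing it in an unsigned 32-bit variable: -1 becomes 4294967295. */
uint32_t
store_sid_unsigned32(int sid)
{
    return (uint32_t)sid;
}

/* With a signed variable, -1 is preserved and this check still works.
 * After a round trip through an unsigned variable it no longer fires. */
int
sid_is_unset(int sid)
{
    return sid < 0;
}
```

A wrapped -1 is an all-ones bit pattern (65535 or 4294967295 in the cases above). The two binary values quoted earlier in the thread are not all ones, so they would need to be traced case by case, as suggested.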
Not sure why the values are strange, but I would need to take each case into account to find out. Please post the locations where you find other issues.
It could be that the calls in DPDK are not accounting for this in any way when there is only one socket or no NUMA per se at all.
The locations I noticed are at ... l2p_pktmbuf_create and parse_cores both in l2p.c
I notice that after hard coding these to 0, it now does not find available ports, which is very confusing.
Please look at the new Pktgen release 24.10.0 and use the latest DPDK version, as DPDK changed again, which caused compile problems. I hope I fixed this issue.
Hi, *** Copyright(c) <2010-2024>, Intel Corporation. All rights reserved. EAL: Detected CPU lcores: 8
Please update to the latest Pktgen 24.10.0 and DPDK 24.11.0-rc1. I have updated the dpdk.org repo, but please use the one located here.
Hello there,
I have a bit of trouble getting started with pktgen and hope someone can help me. I want to use pktgen to read and send the traffic from a pcap file. While trying to run pktgen, at first without a pcap, I always get the following error:
I try to simply start pktgen after freshly compiling with:
sudo ./path/to/pktgen
====== Pktgen got a Segment Fault
Obtained 7 stack frames.
./Pktgen-DPDK/builddir/app/pktgen(+0x25e83) [0x5857c4580e83]
/lib/x86_64-linux-gnu/libc.so.6(+0x42990) [0x746baa642990]
./Pktgen-DPDK/builddir/app/pktgen(+0x99a7) [0x5857c45649a7]
./Pktgen-DPDK/builddir/app/pktgen(+0xa793) [0x5857c4565793]
/lib/x86_64-linux-gnu/libc.so.6(+0x28150) [0x746baa628150]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x89) [0x746baa628209]
./Pktgen-DPDK/builddir/app/pktgen(+0xb025) [0x5857c4566025]
I get this error on multiple test cases. One test setup uses an Intel i9 processor; the other has an AMD Ryzen 5 5600. I have tried ubuntu-22.04 and ubuntu-23.10, but I always get the same problem. Of course I tried setting multiple options, such as -c, -l, -m, -P etc., but the result is still the same. I originally wanted to set up everything in EVE-NG, but virtualizing everything didn't seem to work. However, using ubuntu-22.04 I was able to compile and start pktgen version 22.04.1 with dpdk 22.11.1. In this case I could start pktgen with the command mentioned above.
Since I want to use pktgen for reading and sending pcaps, I tried running pktgen with:
-P -m "[1].0" -s 0:/pcap/imix.pcap
Unfortunately I get the following error:
EAL: Error - exiting with code: 1
Cause: pktgen_pcap_open: rte_zmalloc_socket() failed for pcap_info_t structure
I really hope someone can help here.
Sincerely