Performance of cyl_bessel_i() on a low-powered arm64 device #92
Comments
You could create a set of test input files of varying 'quality' by adding varying amounts of Gaussian white noise and center frequency shift to see if the precision is an issue for those variables.
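A minimal sketch of that idea, assuming the GNU Radio 3.10 C++ API (in GRC this is just a File Source → Channel Model → File Sink chain); the file names and the helper function are only illustrative:

```cpp
#include <gnuradio/top_block.h>
#include <gnuradio/gr_complex.h>
#include <gnuradio/blocks/file_source.h>
#include <gnuradio/blocks/file_sink.h>
#include <gnuradio/channels/channel_model.h>
#include <string>

// Write one impaired copy of a captured IQ file for a given noise level and
// carrier frequency offset (normalized to the sample rate), so the decoder
// can be run offline against each copy and the CRC pass rate compared.
void write_impaired_copy(const std::string& in_file, const std::string& out_file,
                         double noise_voltage, double freq_offset_norm)
{
    auto tb = gr::make_top_block("impairment_sweep");
    auto src = gr::blocks::file_source::make(sizeof(gr_complex), in_file.c_str(), false);
    auto chan = gr::channels::channel_model::make(noise_voltage, freq_offset_norm);
    auto snk = gr::blocks::file_sink::make(sizeof(gr_complex), out_file.c_str());
    tb->connect(src, 0, chan, 0);
    tb->connect(chan, 0, snk, 0);
    tb->run();  // runs to completion since the file source is finite
}
```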
I didn't see any difference in response in terms of the number of packets decoded with valid CRCs. I used the Channel Model block and varied the noise_voltage and frequency_offset parameters independently in small steps until the number of valid CRCs declined to zero. On the other hand, there are too many other LoRa block configurations to draw a definite conclusion from this limited experiment. A more obvious point is that my 5 Msps sampling rate is somewhat high, and unfortunately it's the lowest usable rate of my receiver. cyl_bessel_i() is executed on the order of O(samp_rate * 2^sf) times, so reducing the input sampling rate would probably be a simpler approach for my particular problem.
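Taking that scaling at face value: with SF 11, 2^11 = 2048, so the call count grows linearly with the sample rate, and resampling 5 Msps down to 1 Msps should cut the cyl_bessel_i() workload by roughly 5x, independent of any precision change.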
Nice... a single data point to be sure, but it's a pleasant single data point. :-)
So your frame_sync of_factor is ... 20? In issue 91 it was suggested that 4 should be adequate. If you filter and decimate by 5, do you still get good results?
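For reference, a rough sketch of what "filter and decimate by 5" could look like, assuming GNU Radio 3.10's C++ API; the cutoff and transition width are just illustrative guesses around the 250 kHz LoRa bandwidth:

```cpp
#include <gnuradio/filter/firdes.h>
#include <gnuradio/filter/fir_filter_blk.h>
#include <vector>

// Low-pass filter and decimate by 5 in one FIR block: 5 Msps in, 1 Msps out.
// The filter only needs to pass the 250 kHz LoRa channel and reject anything
// that would alias after decimation.
auto make_decimator()
{
    const double samp_rate = 5e6;
    const double cutoff = 150e3;      // just above half the 250 kHz bandwidth
    const double transition = 100e3;  // arbitrary; trades tap count vs. CPU
    std::vector<float> taps =
        gr::filter::firdes::low_pass(1.0, samp_rate, cutoff, transition);
    return gr::filter::fir_filter_ccf::make(5, taps);
}
```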
It looks like the Low Pass Filter and Rational Resampler blocks are quite CPU intensive. A receiver flow with additional filtering or resampling blocks creates about 4x more load and occupies an entire Cortex-A72 core. I'm going to try just a cheapo 1 Msps radio next week.
While running the receiver on a low-powered device like a Raspberry Pi, I'm seeing a high CPU load. The signal is sampled at 5 Msps, SF 11, BW 250 kHz.
Quick profiling of a run-to-completion flow from a File Source without a Throttle block shows that boost::math::cyl_bessel_i() takes a substantial share of the time. As it turns out, the default Boost.Math policy promotes double arguments to long double, which the device struggles to compute with.
The promotion can be disabled as described in https://www.boost.org/doc/libs/1_85_0/libs/math/doc/html/math_toolkit/tradoffs.html:
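A minimal sketch of that kind of change (the wrapper and policy alias names are illustrative, not the actual gr-lora_sdr call site):

```cpp
#include <boost/math/special_functions/bessel.hpp>
#include <boost/math/policies/policy.hpp>

// Keep double arguments in double precision instead of promoting them to
// long double (the Boost.Math default). On AArch64 Linux, long double is
// 128-bit and software-emulated, which is what makes the default so slow.
using fast_bessel_policy = boost::math::policies::policy<
    boost::math::policies::promote_double<false>>;

// Illustrative wrapper: same result type, cheaper internal evaluation,
// at the cost of a few ULPs of accuracy.
inline double bessel_i0_fast(double x)
{
    return boost::math::cyl_bessel_i(0.0, x, fast_bessel_policy());
}
```

Boost.Math also documents a project-wide alternative: defining BOOST_MATH_PROMOTE_DOUBLE_POLICY to false before including any Boost.Math headers changes the default policy without touching individual call sites.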
The fix gives a whopping ~3x speedup on an RPi 4 without any decoding degradation on my signal. However, I don't know whether long double precision is strictly required here, or whether it can be downgraded just like that.