Network boot taking nearly 2 minutes if STP is active on managed Ethernet switch #480
Comments
One could argue that the longer delay is harder to ignore, and therefore more likely to be detected and solved by turning off STP on that port.
If detection is the goal, then I suggest the boot firmware print a big fat warning message when it detects that STP is enabled on the network port, to alert the user. In the Piserver code I already display a warning on the server if it detects that STP is active on the network.
The DHCP timeout is quite long already and can be overridden in the EEPROM config. Not resetting the PHY is likely to cause as many problems as it solves, so I think the answer is that the network boot behaviour is not likely to change in the near future.
How long does it actually keep sending new DHCP discover packets, as opposed to just waiting for replies? As you can see in my switch log, the network link carrier goes up at 00:20:57.771, and the port allows packets to be forwarded at 00:21:24.782. The fine documentation suggests the default timeout is 45 seconds: https://www.raspberrypi.com/documentation/computers/raspberry-pi.html#DHCP_TIMEOUT
And does the PHY reset have to wait until the boot order reaches network boot?
To answer my own question.
The last DHCP discover packet is sent 24 seconds after the first. Your default timeout of 45 seconds may be long, but if no DHCP discovers are sent after 24 seconds, and everything in the first 30 seconds or so is dropped because the user has an STP-enabled switch, that is not going to work...
The fine documentation mentions that it is supposed to retry at 4 second intervals, while in practice it seems more like an exponential back-off delay starting at 8 seconds.
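To put numbers on this, here is a rough sketch that checks which DHCP discovers would leave the switch port, using the two switch log timestamps quoted earlier in this thread. It rests on assumptions that are mine, not the firmware's: the first discover is taken to go out the moment the link carrier comes up, the observed schedule is taken to be 0, 8 and 24 seconds (inferred from "back-off starting at 8 seconds" and "last discover 24 seconds after the first"), and the 4, 8, 16, 32 figures are read as successive retry intervals. The helper names are made up for illustration.

```python
from datetime import datetime

# Timestamps quoted from the switch log earlier in this thread.
LINK_UP = "00:20:57.771"
FORWARDING = "00:21:24.782"


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed seconds between two HH:MM:SS.mmm timestamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


def send_offsets(intervals):
    """Turn retry intervals into send times relative to the first discover."""
    offsets, elapsed = [0.0], 0.0
    for interval in intervals:
        elapsed += interval
        offsets.append(elapsed)
    return offsets


def forwarded(offsets, blocked_for):
    """Return the discovers sent once the port has started forwarding."""
    return [t for t in offsets if t >= blocked_for]


# How long the port dropped traffic after the link came up.
blocked_for = seconds_between(LINK_UP, FORWARDING)

# Assumption: discovers at 0, 8 and 24 s, inferred from the capture above.
observed = send_offsets([8, 16])
# Assumption: 4, 8, 16, 32 read as successive retry intervals.
pxe_style = send_offsets([4, 8, 16, 32])

print(f"port blocked for {blocked_for:.3f} s")
print("observed schedule, forwarded:", forwarded(observed, blocked_for))
print("PXE-style schedule, forwarded:", forwarded(pxe_style, blocked_for))
```

Under these assumptions the port is blocked for 27.011 seconds, so none of the observed discovers gets through, while a 4/8/16/32 schedule would get its 28 second discover through. Its 60 second discover would also pass the switch, but it falls outside a 45 second overall timeout.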
Describe the bug
On managed Ethernet switches that have the Spanning Tree Protocol (STP) enabled, every time the network link goes up the switch tests for network loops for roughly 30 seconds before it lets normal traffic through.
In an ideal world, everyone buying a managed Ethernet switch would know how to properly configure it, and only enable full STP on uplink switch ports that connect to other Ethernet switches, and disable it (or set it to portfast/rapid STP flavour) on Ethernet ports leading to host devices like computers and Pi.
We could provide some education on how to configure this, but some users may still struggle with the console cable and CLI commands necessary (not all switches offer a convenient option to set this in the management web interface).
And therefore it is realistic to expect that some will end up using a switch that has STP enabled, and suffer from this problem.
In that case I would expect network boot to be delayed by 30 seconds, but not much more.
In reality, with STP enabled the boot firmware currently takes about 1 minute 50 seconds instead:
It waits 25 seconds for USB MSD to time out, goes to network boot, does not succeed because the switch port is not forwarding traffic yet, goes on to the other boot methods (again waiting 25 seconds for USB MSD), and only succeeds at network boot the second time.
If I look at the events from the switch:
Would it be possible to change the boot firmware so that it only brings the network link carrier up once, at the very beginning, and no longer resets the network link when it goes from USB MSD to network boot mode?
That way the 25 seconds spent waiting for USB MSD to time out would count towards the forwarding delay.
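The effect of that change can be sketched with some simple arithmetic. This is only a model built from the figures in this report (25 seconds for USB MSD, roughly 30 seconds of STP forwarding delay). The function name and the assumption that network boot starts right after the USB MSD timeout are mine, not taken from the firmware.

```python
USB_MSD_TIMEOUT = 25.0   # seconds, from the description above
FORWARDING_DELAY = 30.0  # seconds, the approximate STP delay quoted above


def wait_for_forwarding(link_up_at: float, netboot_starts_at: float,
                        forwarding_delay: float = FORWARDING_DELAY) -> float:
    """Seconds network boot still has to wait before the port forwards."""
    forwarding_at = link_up_at + forwarding_delay
    return max(0.0, forwarding_at - netboot_starts_at)


# Current behaviour as described: the link is reset when network boot starts,
# so the forwarding delay only begins at that point.
current = wait_for_forwarding(link_up_at=USB_MSD_TIMEOUT,
                              netboot_starts_at=USB_MSD_TIMEOUT)

# Proposed behaviour: link up at power-on and not reset afterwards,
# so the USB MSD wait overlaps with the forwarding delay.
proposed = wait_for_forwarding(link_up_at=0.0,
                               netboot_starts_at=USB_MSD_TIMEOUT)

print(f"current: {current:.0f} s blocked after network boot starts")
print(f"proposed: {proposed:.0f} s blocked after network boot starts")
```

With these figures network boot would face about 5 seconds of blocked traffic instead of 30, which a DHCP retry could plausibly cover.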
And it would be nice if it would try DHCP a little longer. The PXE standard demands DHCP discover is retried at 4, 8, 16 and 32 seconds for a reason.
Steps to reproduce the behaviour
Device(s)
Raspberry Pi 4 Mod. B, Raspberry Pi 400, Raspberry Pi CM4, Raspberry Pi CM4 Lite