-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add clock skew and hold time checks #1360
Conversation
One thing that seems to be missing here is clock delay/skew analysis. As it is inherently unlikely to have a hold time failure without clock skew (because in almost all cases, the clock to out time of a flipflop plus minimum routing is more than the hold time requirement of a DFF input), I think it would be good to do that at the same time. However, it will require some consideration about performance, because this does add a bit of extra runtime to do the clock side analysis. Probably okay, but worth benchmarking and considering an option to disable it too. |
That's true. I'm in the process of adding it. The problem I'm facing is that global nets on ECP5 (which I use as my development and testing target since I know most about it) returns zero due to: nextpnr/common/kernel/context.cc Lines 117 to 123 in 50bd8d0
If I comment that ifdef I do get delays for the clock nets. Is this simply a case of ECP5 not having this information? |
Yeah, unfortunately so, due to some questionable design decisions at the time they also aren't routed in a way that makes this easy to implement, either... |
Is this the case for all architectures or only ECP5? If only on ECP5 I will simply switch to a different one and disable clock skew checking on ECP5. |
Only ECP5, although as this is not a feature we've had in the past, I won't 100% guarantee they're actually accurate across all architectures. At a minimum they should be for iCE40, although its global networks are good enough and the device small enough that there's no skew (to test clock skew/hold analysis, you could always disable the use of them and then see a bunch of skew through the non-global routing). |
Can I also add an option to disable global promotion for ECP5 like for ice40? I rather keep my test scripts the same. Even if not used in production it will allow me validate this PR. This will also close: #1043 |
23d4f13
to
0334ca2
Compare
I think this is now in a reviewable state. I initially wanted store the clocks delay from all source FFs to sink FFs but this would become large and slow. I now moved the arrival and required times all in reference to the clock source. I also designed this PR to have a nice springboard to add I did not yet benchmark. I will focus on that tomorrow. But I think the solution I choose shouldn't be very heavy. In the hot spots it's just an add/subtract more. What is more slowly now is the Remaining TODOs:
|
I also changed the timing report to include the type of segment it is.
This is mainly done to make it more clear. Especially now with this PR there can be negative segments like the |
@@ -180,6 +181,20 @@ enum PortType | |||
PORT_INOUT = 2 | |||
}; | |||
|
|||
[[maybe_unused]] static const std::string portType_to_str(PortType typ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These and other enum to string function where mainly used during development. In the past I have often defined them. And now had to again. I would prefer to keep them even if currently unused because they can be handy during development.
ad49765
to
9be280b
Compare
if (fanin.type == CellArc::HOLD && fanin.other_port == ep.second) | ||
init_setuphold.max_delay -= fanin.value.maxDelay(); | ||
init_required.max_delay += fanin.value.maxDelay(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the hold time had the wrong sign.
Hold slack is calculated like arr.value.minDelay() - req.value.maxDelay() + clock_to_clock
. If you say here that hold is negative then it becomes arr.value.minDelay() + hold_time - <rest of required time> + clock_to_clock
which is not correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not directly relevant here but some related food for thought - if you use minimum clock/routing delays for the arrival time and maximum for the required time, this is likely to be unnecessarily pessimistic, particularly on the clock side where it is probably that some of the routing is shared so it can't physically have two different delays.
But in general, nextpnr doesn't really have a clear concept of whether min/max delays are overall or for a specific corner (iirc Vivado has fast-min, fast-max, slow-min, slow-max for example which might actually be a better use of the four values DelayQuad than seperate rising/falling delays). This means that as well as shared routing that's guaranteed to have the same delay; as the entire chip will be at essentially the same temperature, voltage and process corner; the difference between delays will never be as high as min vs max.
To avoid this pessimism I think it might be better to just use one corner of routing delays (e.g. min delays everywhere) for hold analysis. Even if this is slightly optimistic it's unlikely for any of the FPGAs we deal with that on-chip variation is going to be that high (but OCV is a field in itself...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it has no impact since the route delay is just a delay_t
(via delay_t Context::getNetinfoRouteDelay
) so the min/max corners are the same. But a comment there would be nice if this ever changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, that should be fine for now then
9be280b
to
c73c327
Compare
922b7d3
to
9dd9888
Compare
I did 100 runs over the night with 18k lut design current master vs this branch:
Doesn't seem to make an appreciable difference as expected. clock skew analysis was enabled during routing and final reporting. Since this doesn't have a real performance impact I would say just keep it on during routing and final reporting. |
@gatecat Do you still want an explicit option to disable it? Due to the low performance impact and the extra safety I would just keep it always enabled. Perhaps even while doing placement. |
Hmm I think I'm seeing some weird things. Do not press the merge button yet :). |
Seems fine to keep it always enabled based on the metrics you've shown |
timing: Add clock2clock delay as seperate timing line item.
timing_log: Use common segment type strings
timing: Fix max_delay_by_domain_pair function timing: Fix hold time check
timing: Disable clock_skew analysis by default
9b9e1ac
to
2550b14
Compare
I'm now confident this works correctly. The hold slack now matches the path delay reported which was not the case due to min/max bounds of a delay pair not being chosen correctly. So if you are happy with it. it can be merged. Next up for me is min/max delay constraints. Since both can make use of existing infrastructure in the timing analyzer this shouldn't be too hard anymore. Ex of hold time report:
|
Comments have been handled. |
Based of #1359