Optional heaps in Holmake #1283
Comments
This sounds plausible. Notes:
Do you have to run It seems that the solution is to create a dependency graph that never includes the “optional heaps” as dependencies if they are “bad”, but which still keeps them as targets. That way they do get built for next time. The command that Holmake stores in its graph as the thing to run for a given target would or would not include that heap, depending on whether the heap is bad. But this decision doesn't depend on anything being cached on disk.
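A minimal Python sketch of that graph shape, for illustration only. The function name `plan_build`, the target names, and the tuple-shaped "commands" are all hypothetical and are not Holmake's actual data structures. The heap is always kept as a target, but theories only depend on it (and only mention it in their command) when it is judged good.

```python
def plan_build(theories, heap, heap_deps, heap_is_good):
    """Return (deps, commands) for a directory with one optional heap.

    theories:     mapping from theory target to its ordinary dependencies
    heap:         name of the optional heap target (hypothetical)
    heap_deps:    targets that the heap itself is built from
    heap_is_good: whether the heap is currently usable
    """
    # The heap always stays in the graph as a target, so it gets built
    # for next time even when nothing depends on it in this run.
    deps = {heap: set(heap_deps)}
    commands = {heap: ("build-heap", heap)}
    for name, own_deps in theories.items():
        if heap_is_good:
            # Good heap: depend on it and run the job on top of it.
            deps[name] = set(own_deps) | {heap}
            commands[name] = ("build-theory", name, heap)
        else:
            # Bad heap: no dependency edge and no mention in the command.
            deps[name] = set(own_deps)
            commands[name] = ("build-theory", name, None)
    return deps, commands
```

The decision is taken while the graph is constructed, so nothing extra needs to be cached on disk for it.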
OK, I'll add a bit more of my thinking. The dependency information is already cached in the Indeed, I did consider having multiple kinds of dependency edge but no change to heap selection. That approach has some attractions; however, I think it would be less effective. It would work well for partial rebuilds, but not for partial builds. Consider the A->B->C example above. On a rebuild caused by a small change in A, this would rebuild the heap used in B, avoid redundant rebuilds of previously successful builds in B, etc. However, when attempting a partial build of one theory in C when nothing else is built, this will still block it waiting on everything in A+B to be built into a heap.
Sorry, I think I'd misunderstood some of your points. I didn't realise To clarify, the problem I see is on the
To follow up, I'm happy to leave this for the moment and spend some cycles thinking about what we're aiming for here. I guess my claim was that loading of theory objects is "pure" in some sense, so that it's OK to use a heap that contains additional out-of-date theory objects as long as they won't be used, or to use a prior heap which doesn't contain some theories as long as they won't be used. But, if that's our point, is it really heaps we need to improve, or would it be better to try to be faster loading the
I thought recent profiling suggested it was loading code that was causing the slow-downs, but I agree that it would be nice to know just where the slowness is coming from. Here's my attempt to make at least one scenario concrete. Imagine that directory If I then touch
So, if this scenario is the one we're really interested in, the issue is not stale heaps, it's being able to ignore fresh heaps that we're confident won't affect us. If so, Of course, in this scenario, with the current implementation Given that we have forgotten how heaps behave in practice, I'd recommend starting by building more heaps (or just using the one in
Loading CakeML’s I have often argued that putting everything and the kitchen sink into
I'm hoping to start a discussion about a possible feature, motivated by CakeML build times.
In some Holmake runs, a large fraction of the CPU time is spent loading. I can include stats if necessary. Load times (to load up the included theories and libraries) can be tens of seconds in the CakeML repo. This is particularly unsatisfactory when `--fast` is enabled, of course, as proofs are completed rapidly but the load times are the same.
PolyML heaps are a potential solution, but have serious drawbacks. As currently implemented, they fill in the dependency graph. Suppose the build sequence has directories A, B, C with some dependencies from A -> B and B -> C. Without heaps, a change to one theory in A may permit a minimal rebuild of affected theories in B + C. Suppose, however, that B/Holmakefile declares a heap containing A. That heap will then require a rebuild, which triggers a rebuild of every theory in B.
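To make the A, B, C example concrete, here is a small Python sketch that computes which targets need rebuilding after one theory changes. The theory names (`A/a1`, `B/b1`, and so on) and the function `affected` are made up for illustration. The heap is modelled as one extra target that depends on everything in A and that every theory in B depends on.

```python
from collections import deque

def affected(deps, changed):
    """Return `changed` plus every target that transitively depends on it.

    deps maps each target to the set of targets it depends on.
    """
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents = {}
    for target, ds in deps.items():
        for d in ds:
            dependents.setdefault(d, set()).add(target)
    seen = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical layout: two theories in A, two in B, one in C.
without_heap = {
    "A/a1": set(), "A/a2": set(),
    "B/b1": {"A/a1"}, "B/b2": {"A/a2"},
    "C/c1": {"B/b1"},
}
# Same layout, but B declares a heap containing all of A.
with_heap = {
    "A/a1": set(), "A/a2": set(),
    "B/heap": {"A/a1", "A/a2"},
    "B/b1": {"A/a1", "B/heap"}, "B/b2": {"A/a2", "B/heap"},
    "C/c1": {"B/b1"},
}

print(sorted(affected(without_heap, "A/a1")))
print(sorted(affected(with_heap, "A/a1")))
```

Without the heap, only `B/b1` and `C/c1` follow the change to `A/a1`. With the heap, `B/heap` and `B/b2` are pulled in as well, even though `B/b2` never used `A/a1`.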
I think that the best of both of these cases can be had with some minimal changes. The heap to use is already decided at startup time by running the `heapname` executable. It seems reasonable to support a case where a heap is used if it is available (to save CPU time in loading it) but not used if it is unavailable or out of date (to save CPU time in building it). The user could opt in to this mechanism by using a variant of the `HOLHEAP` variable.
The only tricky bit is to ensure that `heapname` can quickly make the right decision. It should know when a heap is out of date. It shouldn't have to replay holdep or Holmake steps to examine dependencies or timestamps.
My idea is for Holmake to put a persistent note into the world when a target is out of date. I propose to do this by saving the actions to be taken as soon as Holmake has built a graph. Specifically, for each target to be built (by a built-in command or otherwise), create a file in the relevant `.HOLMK` directory listing some details of the job. If that file remains newer than the target, then the target is out of date. This seems simple and stable enough. It's possible that the buildheap mechanism also needs to be tweaked to place the final heap object atomically.
Does this make sense?
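A rough Python sketch of the check described above, not actual Holmake or `heapname` code. The function names, the note-file layout, and the fallback heap are all assumptions made for illustration. It shows the "note newer than target" test, the resulting heap choice, and one common way to place a file atomically (write to a temporary file in the same directory, then rename).

```python
import os
import tempfile
from pathlib import Path

def target_is_stale(note: Path, target: Path) -> bool:
    """Stale if the target is missing or a pending-job note is newer than it."""
    if not target.exists():
        return True
    if not note.exists():
        return False  # no job was ever recorded against this target
    return note.stat().st_mtime_ns > target.stat().st_mtime_ns

def choose_heap(optional_heap: Path, note: Path, fallback_heap: Path) -> Path:
    """Use the optional heap only when it exists and is up to date."""
    if target_is_stale(note, optional_heap):
        return fallback_heap
    return optional_heap

def place_atomically(data: bytes, dest: Path) -> None:
    """Write to a temp file beside dest, then rename it over dest."""
    fd, tmp = tempfile.mkstemp(dir=dest.parent)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, dest)  # a reader sees the old file or the new one
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

The check needs only two `stat` calls, so it involves no replay of holdep or Holmake steps. Atomic placement matters here because a half-written heap would otherwise look newer than its note.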