make ANTLR3 produce Reproducible output #209
Signed-off-by: Hervé Boutemy <[email protected]>
e8866a1 to 75e5e87
Is removing the timestamp going to break any code that relies on it? I agree it was a dumb idea, but I'm afraid to change it now. haha
perhaps adding an option for users to choose if they want this timestamp or not is a better choice (like JAXB that provides
I'm generally opposed to options, I'm afraid. In this case it's a fairly heavy change just to get rid of this date output, which in retrospect was definitely a mistake on my part. I agree that the output should be reproducible, but I'm not sure risking backward compatibility is worth it. I do know that some companies simply remove that line using their build tools. Is this possible with Maven?
yes, we do it with maven-replacer-plugin
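As a rough illustration of the post-processing workaround mentioned above, here is a minimal Java sketch that strips the date from the generated header. It is not the maven-replacer-plugin configuration itself. The class name `AntlrHeaderCleaner` is hypothetical, and the header shape (`// $ANTLR <version> <grammar> <yyyy-MM-dd HH:mm:ss>`) is an assumption about the generated output, so check it against your own generated files first.

```java
import java.util.regex.Pattern;

public class AntlrHeaderCleaner {
    // Assumed header shape (not confirmed by this thread):
    //   // $ANTLR <version> <grammar file> <yyyy-MM-dd HH:mm:ss>
    private static final Pattern HEADER_TIMESTAMP = Pattern.compile(
        "^(// \\$ANTLR .*?) \\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}$",
        Pattern.MULTILINE);

    /** Returns the source with the trailing date removed from any matching header line. */
    public static String stripTimestamp(String source) {
        return HEADER_TIMESTAMP.matcher(source).replaceAll("$1");
    }

    public static void main(String[] args) {
        String generated =
            "// $ANTLR 3.5.2 T.g 2015-07-03 10:15:30\n"
            + "public class TParser {}\n";
        // Lines that do not match the assumed header shape are left unchanged.
        System.out.print(stripTimestamp(generated));
    }
}
```

This keeps the rest of the header line, so anything that only looks for the `$ANTLR` marker should still find it.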
one question: there are 2 sources of non-reproducible output
fixed by the sorting in the commit on file tool/src/main/antlr3/org/antlr/grammar/v3/CodeGenTreeWalker.g. Is it possible to fix the reproducibility issue for the elements, please? This would reduce the number of places where we need to post-process.
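To illustrate the general idea behind a sorting fix of this kind, here is a minimal Java sketch that copies an unordered collection into a sorted, read-only view before anything is emitted from it. This is not the code from the commit on CodeGenTreeWalker.g. The class name `SortedView` and the rule names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Hypothetical wrapper: iteration order is the natural order of the elements. */
public final class SortedView<T extends Comparable<? super T>> implements Iterable<T> {
    private final List<T> items;

    private SortedView(List<T> items) {
        this.items = items;
    }

    /** Copies the source and sorts the copy; the source collection is not modified. */
    public static <T extends Comparable<? super T>> SortedView<T> of(Collection<T> source) {
        List<T> copy = new ArrayList<T>(source);
        Collections.sort(copy);
        return new SortedView<T>(Collections.unmodifiableList(copy));
    }

    @Override
    public Iterator<T> iterator() {
        return items.iterator();
    }

    public List<T> asList() {
        return items;
    }

    public static void main(String[] args) {
        // HashSet iteration order is unspecified, so emitting from it directly
        // may differ between JVMs or runs.
        Set<String> rules = new HashSet<String>();
        rules.add("ruleB");
        rules.add("ruleA");
        rules.add("ruleC");
        for (String rule : SortedView.of(rules)) {
            System.out.println(rule);
        }
    }
}
```

Giving the sorted copy its own type means a method that emits code can declare `SortedView<T>` in its signature, which makes it harder to pass an unordered set by accident.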
Instead of totally dropping the timestamp, ANTLR could be updated to support SOURCE_DATE_EPOCH for the build timestamp. This should leave any existing use cases for the build stamp undisturbed but make it easy for users to opt in to a deterministic mode using this standard. I have also observed non-determinism in the order of methods in the "Delegated rules" section of the generated parser. Other hazards from looking through the templates and the code:
My audit is far from complete though. Things to ponder: One could replace all HashSets and HashMaps with their Linked versions (LinkedHashSet, LinkedHashMap), but that seems like a rather heavy hammer. It would also be nice if the collections that need deterministic iteration had a distinguishing static type, to avoid re-introducing problems.
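The SOURCE_DATE_EPOCH suggestion in this comment could look roughly like the following Java sketch. It is not existing ANTLR code. The class name `BuildTimestamp`, the date pattern, and the choice to format in UTC are assumptions made for illustration. The environment is passed in as a map so the behaviour can be checked without changing the real process environment.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Map;

public class BuildTimestamp {
    // Assumed output pattern; formatted in UTC so the result does not depend
    // on the time zone of the build machine.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    /**
     * Uses SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is set,
     * and falls back to the current time otherwise.
     */
    public static Instant resolve(Map<String, String> env) {
        String epoch = env.get("SOURCE_DATE_EPOCH");
        if (epoch != null && !epoch.trim().isEmpty()) {
            return Instant.ofEpochSecond(Long.parseLong(epoch.trim()));
        }
        return Instant.now();
    }

    public static String format(Instant instant) {
        return FORMAT.format(instant);
    }

    public static void main(String[] args) {
        // With SOURCE_DATE_EPOCH set, repeated runs print the same value.
        System.out.println(format(resolve(System.getenv())));
    }
}
```

Builds that do not set the variable keep a current-time stamp, so existing uses of the stamp are left alone, while a build that exports SOURCE_DATE_EPOCH gets the same stamp every time.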
as found while rebuilding projects using ANTLR3 (including ANTLR3 itself), there are non-reproducible outputs at 2 levels:
this PR fixes the 2 issues in 2 separate commits: