-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add indentation when dumping json. #645
Conversation
Codecov Report
@@ Coverage Diff @@
## master #645 +/- ##
=======================================
Coverage 78.31% 78.31%
=======================================
Files 65 65
Lines 7079 7079
Branches 1310 1310
=======================================
Hits 5544 5544
Misses 1227 1227
Partials 308 308
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to add pretty printing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before synced collections, state point files were indented but job documents were not indented. This PR adds indentation to both files, to which I am opposed.
I generally oppose whitespace in either file and prefer the new behavior of synced collections (no indentation). However, state point indentation was previously a behavior of the package that some users implicitly relied on (e.g. committing state points in a git repository) so I can accept a reversion to that behavior if we believe it is necessary for continuity with past versions. The "pretty" aspect is not important to me, only the fact that synced collections introduces a diff with respect to the previous formatting. Many tools exist for pretty-printing JSON, including one built into Python: python -m json.tool signac_statepoint.json
.
I do not support indenting job documents because that was not the behavior of job documents before synced collections, and the overhead in file size is significant for job documents. Many users have large amounts of data in job documents, and indenting that data will result in an unnecessarily large output file. For a sample job document from my research projects, adding indent=2
changes the size of the file from 2092 characters to 2690 characters, a 28% increase in disk space and a corresponding loss of some measurable amount of performance in reading/parsing/writing.
I would prefer to close this PR without merging and keep the current behavior of the package rather than revert to our implicit behavior pre-synced collections.
@bdice I agree that we should keep both flat, but I could be convinced to have state points follow their old behavior (with a path to making them flat) How hard would it be to use asv to measure the difference (or do none of the tests create enough docs/state points that would have an impact? I don't need convincing but it would be nice to point it to a user who wants the old behavior back. |
I had this change sitting around locally for a couple of months and I found it when I started working on other signac projects. It was specifically requested by a user at some point. That said, synced collections have been out for months and we've only had one user request this change, and he was OK with the new version once he was able to resolve discrepancies between older and newer versions of signac. @bdice I'd be remiss if I didn't remind you that when that issue was originally raised you did care about the pretty-printing for state points :) so I'm not sure what has changed since then. Personally I do not think it's important for us to revert to that old behavior and would be fine closing this PR. Just having it here is sufficient to socialize the knowledge of this change |
I was young and naïve at the time but little has changed. 😉 |
Closing this for now. If we hear many future complaints from users we can revisit the question. |
Description
This PR adds pretty indented formatting to the JSON files dumped by signac.
Motivation and Context
The introduction of synced collections in #484 led to a change in the formatting of the JSON files used for job documents and statepoints. While we do not promise any specific formatting (so this is not technically a bug), these files are much more readable if the original formatting is maintained and users are accustomed to the old formatting.
Types of Changes
1The change breaks (or has the potential to break) existing functionality.
Checklist:
If necessary: