Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add indentation when dumping json. #645

Closed
wants to merge 3 commits into from
Closed

Add indentation when dumping json. #645

wants to merge 3 commits into from

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Nov 14, 2021

Description

This PR adds pretty indented formatting to the JSON files dumped by signac.

Motivation and Context

The introduction of synced collections in #484 led to a change in the formatting of the JSON files used for job documents and statepoints. While we do not promise any specific formatting (so this is not technically a bug), these files are much more readable if the original formatting is maintained and users are accustomed to the old formatting.

Types of Changes

  • Documentation update
  • Bug fix
  • New feature
  • Breaking change1

1The change breaks (or has the potential to break) existing functionality.

Checklist:

If necessary:

  • I have updated the API documentation as part of the package doc-strings.
  • I have created a separate pull request to update the framework documentation on signac-docs and linked it here.
  • I have updated the changelog and added all related issue and pull request numbers for future reference (if applicable). See example below.

@vyasr vyasr added the bug Something isn't working label Nov 14, 2021
@vyasr vyasr added this to the v1.8.0 milestone Nov 14, 2021
@vyasr vyasr requested review from a team as code owners November 14, 2021 22:08
@vyasr vyasr self-assigned this Nov 14, 2021
@vyasr vyasr requested review from kidrahahjo and syjlee and removed request for a team November 14, 2021 22:08
@codecov
Copy link

codecov bot commented Nov 14, 2021

Codecov Report

Merging #645 (8c2e54e) into master (8a804d0) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #645   +/-   ##
=======================================
  Coverage   78.31%   78.31%           
=======================================
  Files          65       65           
  Lines        7079     7079           
  Branches     1310     1310           
=======================================
  Hits         5544     5544           
  Misses       1227     1227           
  Partials      308      308           
Impacted Files Coverage Δ
...nac/synced_collections/backends/collection_json.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8a804d0...8c2e54e. Read the comment docs.

Copy link

@syjlee syjlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to add pretty printing.

Copy link
Member

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before synced collections, state point files were indented but job documents were not indented. This PR adds indentation to both files, to which I am opposed.

I generally oppose whitespace in either file and prefer the new behavior of synced collections (no indentation). However, state point indentation was previously a behavior of the package that some users implicitly relied on (e.g. committing state points in a git repository) so I can accept a reversion to that behavior if we believe it is necessary for continuity with past versions. The "pretty" aspect is not important to me, only the fact that synced collections introduces a diff with respect to the previous formatting. Many tools exist for pretty-printing JSON, including one built into Python: python -m json.tool signac_statepoint.json.

I do not support indenting job documents because that was not the behavior of job documents before synced collections, and the overhead in file size is significant for job documents. Many users have large amounts of data in job documents, and indenting that data will result in an unnecessarily large output file. For a sample job document from my research projects, adding indent=2 changes the size of the file from 2092 characters to 2690 characters, a 28% increase in disk space and a corresponding loss of some measurable amount of performance in reading/parsing/writing.

I would prefer to close this PR without merging and keep the current behavior of the package rather than revert to our implicit behavior pre-synced collections.

@mikemhenry
Copy link
Collaborator

@bdice I agree that we should keep both flat, but I could be convinced to have state points follow their old behavior (with a path to making them flat)

How hard would it be to use asv to measure the difference (or do none of the tests create enough docs/state points that would have an impact?

I don't need convincing but it would be nice to point it to a user who wants the old behavior back.

@vyasr
Copy link
Contributor Author

vyasr commented Nov 17, 2021

I had this change sitting around locally for a couple of months and I found it when I started working on other signac projects. It was specifically requested by a user at some point. That said, synced collections have been out for months and we've only had one user request this change, and he was OK with the new version once he was able to resolve discrepancies between older and newer versions of signac. @bdice I'd be remiss if I didn't remind you that when that issue was originally raised you did care about the pretty-printing for state points :) so I'm not sure what has changed since then. Personally I do not think it's important for us to revert to that old behavior and would be fine closing this PR. Just having it here is sufficient to socialize the knowledge of this change

@bdice
Copy link
Member

bdice commented Nov 17, 2021

I'm not sure what has changed since then.

I was young and naïve at the time but little has changed. 😉

@vyasr
Copy link
Contributor Author

vyasr commented Dec 3, 2021

Closing this for now. If we hear many future complaints from users we can revisit the question.

@vyasr vyasr closed this Dec 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants