Currently, I think it's quite hard for people to contribute. The main reason, in my opinion, is that the Jupyter notebooks stored in the repository include cell outputs, images, and so on. As a result, users who make local changes and commit them are presented with huge diffs, and when they pull in upstream changes they are bound to hit large, hard-to-resolve git merge conflicts. That poses a huge barrier for people who are not git experts.
Right now, I see two possible solutions:
1. Store the notebooks in a different format, and only convert them to notebooks and run them in CI (to produce the actual website content with cell outputs and images).
   - One option that I see is notedown, which has a markdown format that converts pretty nicely to and from the ipynb format.
   - The big drawback is that contributors will likely be working in Jupyter and would then have to convert to markdown themselves, drop the changed markdown files into the repo, and commit them. So I guess this just introduces different hurdles in the contribution process.
2. Store the notebooks in the repo stripped of all output, execute them only during CI, and commit the notebooks including the output to a different location (either a separate folder in the repo or, ideally, only the gh-pages branch).
   - This has the advantage that users can still work directly in the repo using Jupyter, but are confronted with only minimal diffs of their actual changes.
   - This could be done using nbstripout, which can be installed via anaconda and has a one-liner setup call.
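To make the second option concrete, here is a minimal sketch of what stripping output from a notebook amounts to. It is not nbstripout's actual implementation, only an illustration using the standard library. It assumes the nbformat 4 JSON layout, in which code cells carry `outputs` and `execution_count` fields. The `strip_outputs` helper and the `example` notebook are hypothetical names made up for this sketch.

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict (nbformat 4 layout) without cell outputs."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop text, images, tracebacks
            cell["execution_count"] = None  # drop the execution counter
    return stripped


# Hypothetical two-cell notebook standing in for a real .ipynb file.
example = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
}

clean = strip_outputs(example)
# Only the source of the code cell survives, so a diff shows real edits only.
print(json.dumps(clean["cells"][1], indent=1))
```

With outputs removed, the committed file only changes when a contributor edits the cell sources, which is what keeps the diffs small.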
Comments?
I agree that it is currently much too hard to contribute, and I'm not sure how to really improve this. Personally, I would really like to keep the rendered output in the repository, for two reasons:
The notebooks can be viewed (with results) on GitHub and in the notebook viewer web app.
Testing is possible, and in the short term we plan to add testing with nbval (https://github.com/computationalmodelling/nbval). This is really needed to be able to maintain seismo-live; it is already too much work to check every notebook manually.
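To illustrate why stored output matters for this kind of testing: nbval re-executes a notebook and compares the fresh results against the outputs saved in the file. The sketch below is a heavily simplified stand-in for that idea, not nbval's implementation. It compares only stream (stdout/stderr) text between two notebook dicts in the nbformat 4 layout. The helpers `stream_text` and `changed_cells`, and the `stored` and `fresh` notebooks, are hypothetical.

```python
def stream_text(cell):
    """Concatenate the stdout/stderr text stored in a code cell's outputs."""
    parts = []
    for output in cell.get("outputs", []):
        if output.get("output_type") == "stream":
            text = output.get("text", "")
            # nbformat allows either a list of lines or a single string.
            parts.append("".join(text) if isinstance(text, list) else text)
    return "".join(parts)


def changed_cells(stored, fresh):
    """Return indices of code cells whose stream output differs.

    Assumes both notebooks have the same cells in the same order;
    zip() silently ignores extra cells in the longer notebook.
    """
    changed = []
    for index, (old, new) in enumerate(zip(stored["cells"], fresh["cells"])):
        if old.get("cell_type") == "code" and stream_text(old) != stream_text(new):
            changed.append(index)
    return changed


def code_cell(source, text):
    """Build a minimal code cell with a single stdout stream output."""
    return {
        "cell_type": "code",
        "source": [source],
        "outputs": [{"output_type": "stream", "name": "stdout", "text": [text]}],
    }


# Hypothetical stored notebook and a freshly executed version of it.
intro = {"cell_type": "markdown", "source": ["# Demo"]}
stored = {"cells": [intro, code_cell("print(1 + 1)", "2\n"), code_cell("print('hi')", "hi\n")]}
fresh = {"cells": [intro, code_cell("print(1 + 1)", "4\n"), code_cell("print('hi')", "hi\n")]}

print(changed_cells(stored, fresh))
```

If the repository held only stripped notebooks, there would be no stored output to compare against, which is the trade-off with the second proposal above.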
Add a "Submit notebook" button to seismo-live that submits either a modified or a new notebook and simply sends it to us via email. This would be a bit more work for us, but it might encourage users to be more proactive.
In the long term, I hope that GitHub provides a nicer interface for editing notebooks.
I'd also be very happy to hear thoughts and opinions of other people!