Add some elementary EDA to interactive dataset pages #126
@janosh Whoa, this is super cool. I think directly adding them to MPContribs as something that would be dynamically generated on MP's side would open a can of worms (i.e., we would have to pester @tschaume to maintain and update this code, which I don't think he wants to do). I like the idea of adding it to I can merge this in and then add the ipynb code to |
Good to know there's interest! I'm happy to do that and save you some trouble.
My bad, the many changed files are the result of running a few CLI commands that apply auto-fixes like |
@ardunn Alright, here's a rough draft for a function I saved all figures as HTML with Also, for speed of regenerating these plots, |
@mkhorton suggested we could put these plots on https://materialsproject.org/ml (which has apparently just been a placeholder so far) and make them interactive using dash/crystal-toolkit.
Yes, that sounds great! Right now there seem to be some conflicts (I tried to fix them but I may have messed something up lol)
Oops, did you already solve the conflicts in e8b88e6? If so, I'll undo my merge. I wasn't quite sure if I'd merged everything correctly anyway. It's been 2 months now and I can't quite remember which parts I wrote.
@janosh I think I tried but something got messed up. Maybe the easiest thing is to undo the merge (without re-linting afterwards; I think that was the source of many conflicts). Edit: Oops, didn't mean to close this.
I undid the merge. Now |
…num and crys_sys but no structures; could also gitignore these and generate on the fly to keep repo size down
Sure, that seems fine to me!
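For reference, the "gitignore these and generate on the fly" option mentioned above could be handled with a small cache helper along these lines. This is only a sketch: `load_or_compute`, the JSON cache format, and the `spg_num` key are hypothetical names chosen for illustration, not code from this PR.

```python
import json
from pathlib import Path


def load_or_compute(cache_path, compute):
    """Return cached records if the file exists, else compute and cache them.

    Hypothetical helper: `compute` is any zero-argument callable returning
    JSON-serializable records (e.g. per-entry symmetry info, no structures).
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: skip the expensive recomputation.
        return json.loads(cache_path.read_text())

    records = compute()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(records))
    return records
```

With the cache directory listed in `.gitignore`, the files would be rebuilt on the first run of a fresh checkout and reused afterwards, so the repo itself stays small.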
That part is done, so I think this is good to go. The CI errors seem to be the issue described in hackingmaterials/matminer#840 and not related to my changes.
On all the dataset details pages, we could show some interactive Plotly plots.
Taking Matbench Dielectrics (https://ml.materialsproject.org/projects/matbench_dielectric) as an example, here are some ideas for plots that would render below the table on each dataset page:
Matbench Dielectric EDA
Code
Here's a Jupyter notebook to generate those plots. Doing the same for the other datasets would mostly be a matter of swapping out the argument to load_dataset and skipping some plots for composition-only tasks.

@ardunn I'm not sure where exactly this code would live if you think it's worth adding. Would it run as part of rebuild_docs.py? Some guidance there would be great!
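To make the dataset swap concrete, here is a rough sketch of what a per-dataset loop could look like. It is not the notebook's actual code. It assumes that structure-based Matbench tasks have a `structure` column and composition-only tasks a `composition` column, and the plot names, `plots_for`, and `build_target_histogram` are hypothetical. `load_dataset` comes from `matminer.datasets`, and the figure is written with Plotly's `write_html`.

```python
from pathlib import Path

# Hypothetical plot names; the structure-only ones need a "structure" column.
STRUCTURE_ONLY_PLOTS = {"spacegroup_histogram", "crystal_system_counts"}
COMMON_PLOTS = {"target_histogram"}


def plots_for(columns):
    """Return the plot names that make sense for a dataset's columns."""
    plots = set(COMMON_PLOTS)
    if "structure" in columns:
        # Composition-only tasks have no structures, so skip these otherwise.
        plots |= STRUCTURE_ONLY_PLOTS
    return sorted(plots)


def build_target_histogram(dataset_name, target_col, out_dir="plots"):
    """Load one dataset and save a histogram of its target column as HTML."""
    # Imported lazily so plots_for() works without matminer/plotly installed.
    from matminer.datasets import load_dataset
    import plotly.express as px

    df = load_dataset(dataset_name)
    fig = px.histogram(df, x=target_col, title=f"{dataset_name}: {target_col}")
    out_path = Path(out_dir) / f"{dataset_name}_{target_col}.html"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    fig.write_html(str(out_path))
    return out_path
```

Calling `build_target_histogram("matbench_dielectric", "n")` would download the dataset on first use (assuming `n` is the target column name) and write one HTML file. The other datasets would only need a different name and target column, with `plots_for(df.columns)` deciding which plots to skip.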