Brainstorm how example data usage repos will look like, e.g. folders and files #24
Comments
The pages I've looked at so far: DRYAD, ZENODO, and GitHub (Robert Koch Institut). They all showcase the actual data. I can see how we could substitute the metadata for the data, but then I'm not entirely clear on how much project information we want to collect, etc. I think there are a few things I need to have clarified.
Hmm, almost, but not quite. You are right that Sprout does not have a UI to display back to the user; at least these example repos won't. The purpose of these examples is for users to see how they could write the code to use Seedcase software with their own data. So it is an informational/reference repo, way more detailed than a how-to guide.

What it should look like in the end is something like Quarto's gallery: https://quarto.org/docs/gallery/#dashboards For instance, check out the "Gapminder Dashboard" in the Dashboards section. You'll notice that when you click the "source" link, it takes you to a GitHub repo: https://github.com/jjallaire/gapminder-dashboard/tree/main Notice who the owner of the repo is.

So everything in the example repos will be code to convert a messy real dataset into a tidier, Seedcase-organized dataset. So, likely at the root of the repo will be a file called something like …
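To give a feel for the kind of "messy data in, tidy data out" script described here, a rough sketch is below. Everything in it (the sample data, column names, and cleaning steps) is invented for illustration; a real example repo would use the actual dataset and the Seedcase tooling, not this stdlib-only code:

```python
# Hypothetical sketch of a processing script in an example repo:
# read a "messy" raw file and produce a tidier version.
# Data and column names are placeholders, not a real dataset.
import csv
import io

# Stand-in for a messy raw file: inconsistent header casing and stray spaces.
RAW = """Patient ID;AGE ;Sex
001; 34;M
002;51 ;F
"""

def tidy(raw_text: str) -> list[dict]:
    """Normalize headers to snake_case and strip stray whitespace from values."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter=";")
    rows = []
    for row in reader:
        rows.append({
            key.strip().lower().replace(" ", "_"): value.strip()
            for key, value in row.items()
        })
    return rows

tidy_rows = tidy(RAW)
print(tidy_rows[0])
```

In a real example repo, the output of such a script would then be handed to the Seedcase package-creation code rather than printed.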
Ah, so would our repos be more informative, as in not just a single file with uncommented code, but also something that says "first you do this, then you do this, and finally you do this, and here is the result"? Because I admit that looking at that page of code doesn't really tell me a lot about how to go about getting started.
Hmm, no, the purpose would be entirely as a reference. The how-to guides are what users would go to in order to learn how to do things and get started. Plus, it would make our work just a bit easier.
So use the dataset downloaded with the how-to guide and put the result on the repo?
Yea, exactly! We could do it two ways: 1) one repo that contains all the example data packages, or 2) one repo per data package.
Maybe a folder and file structure for the repos for 1) would be:
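The folder structure itself didn't survive in this copy of the thread. Purely as a hypothetical placeholder (all names invented), a single-repo layout for option 1 could look like:

```
data-examples/              # one repo holding all examples (names hypothetical)
├── README.md
├── dataset-a/
│   └── ...                 # raw data, metadata, and scripts for one example
└── dataset-b/
    └── ...
```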
And for 2):
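This structure is also missing from this copy. As a hypothetical placeholder for option 2 (one repo per data package, names invented):

```
dataset-a-example/          # a separate repo per data package (name hypothetical)
├── README.md
├── raw/                    # the messy source data
└── ...
```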
And the code to create the packages using core, lib, or cli would be in those. We should probably do both 1) and 2) at some point, but which do we want to start with? The advantage of having all data in one repo is that there is less to organize. The disadvantage is that it gets tricky to know which dataset links to which data package just from the folder structure. I personally am leaning towards 1).
We could use 1, and then make a top-level folder called something like data-examples. I really like the readability of the first one; it is so simple and easy (at least for me) to understand. It made perfect sense! I'd like to set up the repo once we get to that, then you can check and let me know what I'm missing.
Just to be clear, you mean we could use 1? And the top-level folder will always be the one you suggested (data-examples)? As for making the repo, anyone on the team has permissions to create a repo 😌 You go ahead and set it up and I can add the general infrastructure around it 🤩 😁
Just a small edit to the file structure of 1):
So that the processing scripts can be in one location and so that we can include a README to describe what we are doing with them and why, without having to pollute the root README. |
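The edited structure itself is missing from this copy of the thread, but based on the rationale given here (scripts in one location, with their own README so the root README stays clean), it was presumably something along these lines (names hypothetical):

```
data-examples/
├── README.md               # root README, kept uncluttered
└── scripts/
    ├── README.md           # what the processing scripts do and why
    └── ...
```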
Yeah, I've edited my comment. What I was thinking was one example repo, with a folder between the main folder and the scripts folder (see below), so that we don't end up with five repos clogging up our overview.
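The structure that "see below" pointed to is missing from this copy. Following the description (one example repo, with a per-dataset folder between the main folder and the scripts folder), it was presumably along these lines (names hypothetical):

```
example-repo/
├── README.md
├── dataset-a/
│   └── scripts/
└── dataset-b/
    └── scripts/
```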
I have a feeling we will end up needing to go the "one repo, one data package" approach in the long term, but we can split things up once we get there.