Curation work to ensure all entries give "canonical" tool descriptions #10
Comments
Curation work to remove remaining entry redundancy, ensuring a non-redundant set of “canonical” tool descriptions. This is mostly done, but see e.g. bio-tools/biotoolsRegistry#282.
@hansioan, can we make a definitive list of actions here? To my mind it's this:
@bgruening @piotrgithub1 @matuskalas - Hans, Herve and I have been making a major push on content clean-up (mostly ID verification, tool names and redundancy removal) in preparation for the data dump (#2). Bearing in mind that the vision for bio.tools is to provide "canonical" descriptions of unique tools, may I ask that, if you have a view on clean-ups that need doing in this regard, you let us know very soon? E.g. do we satisfy the requirements for integrating data from bioconda etc.? We hope to complete the clean-up by the end of next week.
@joncison what do you need? Imho we can deal with this after the push. Bioconda will deal with whatever bio.tools drops. Bioconda has already started to annotate packages with bio.tools IDs, so ideally they should stay stable and the content should be YAML from our side. But otherwise, we will know more once we start working on it :)
I was wondering whether any of you already know of content issues that would make the integration hard, duplicates (which are now, I think, nearly all resolved) being an obvious case. We also need to do this clean-up for a paper soon to be submitted (we're all co-authors), which is the main reason for doing it now. Rest assured the dump will go ahead ASAP.
Thanks @joncison! My take on this: we create the bot and the content-validation scripts, and if things fail because of duplicates or the like, we will know and can fix them.
Very good - that would trap any currently unknown issues (and soon we'll have fixed all the known ones). PS: for the validation angle, we already have biotoolsLint (currently just harvesting ideas).
Quick update @bgruening and @hmenager: @hansioan and I are making sweeping progress on the above, but it's a huge job ... will keep you posted. The (clean) content dump will follow once we're done.
Quick update @bgruening @piotrgithub1: @hansioan and I are done with the clean-ups (huge job). The only thing left is a final verification of IDs (for things added in the last few weeks). Once that's done I'll close this issue. I'm not claiming all the content is now perfect, but it's a lot better than it was a couple of months ago in terms of redundancy, sensible IDs, ownership etc. cc @hmenager
UPDATE
One of many issues around GitHub-based content management for bio.tools.