Public Galaxy Server and Tool Metadata #440

Open
tnabtaf opened this issue Jul 23, 2020 · 5 comments
Labels
discussion enhancement New feature or request help wanted Extra attention is needed

Comments

@tnabtaf

tnabtaf commented Jul 23, 2020

I talked with @matuskalas at two consecutive Galaxy CoFests about improving the search functionality of the Galaxy Platform Directory. Between those two conversations, I talked with my bosses about increasing the amount of Galaxy-related information in Bio.Tools and GalaxyCat.

It became obvious while talking with Matúš at this year’s CoFest that these goals complement each other nicely.

This issue could be created in many places:

Eventually there may be pull requests in many of those places (including ToolDog).

Goals

  • Increase the presence of public Galaxy servers and their tools in Bio.Tools and GalaxyCat.
  • Increase awareness of Bio.Tools and GalaxyCat in the Galaxy community.
  • Simultaneously, make the Galaxy Platform Directory contain more useful and searchable information about those platforms.

How?

That’s what this issue is here to discuss. One item seems uncontroversial to me:

  • We should use ontology terms to do this. EDAM and a taxa ontology seem most useful, but others also have obvious applications. For example, RepeatExplorer is all about repetitive elements, which suggests the Sequence Ontology.

And a starting smattering of open questions:

  • Which ontologies? Any ontology that is available in a lookup service, or only a core set of ontologies?
  • How do we add new ontologies, either to our limited list, or that aren’t in our selected lookup service? For example the Climate Workbench server may use ontologies that aren’t in biology-centric lookup services.
  • Where and how do we store and access server ontology information? Storing it on the server itself seems like a good idea, but adding it to metadata in the hub might be a good fallback.
  • A fair amount of work (I think) has gone into supporting EDAM annotation of individual tools. How can we encourage tool wrappers to actually use this functionality?
  • How do we make the Galaxy and larger bioinformatics communities aware of these resources once they are updated?
@tnabtaf
Author

tnabtaf commented Jul 23, 2020

@hexylena

Increase presence of public Galaxy servers and their tools in Bio.Tools and GalaxyCat.
Increase Awareness of Bio.Tools and GalaxyCat in the Galaxy Community.

I think the only way to increase awareness is to embed it in Galaxy somehow; otherwise no one will find this unrelated website without a lot of work. Maybe at the bottom of search results like this:

image

and just link to GalaxyCat (not search by default!) (And also update their data...)

How can we encourage tool wrappers to actually use this functionality?

I think Galaxy needs to do more there. E.g. showing the ontologies somewhere in the UI, allowing searching on inputs/outputs of those ontologies, etc.
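
To make the search idea concrete, here is a minimal Python sketch, assuming tools declare EDAM terms through the `<edam_topics>` and `<edam_operations>` elements of the Galaxy tool XML. The tool IDs, the sample XML, the EDAM IDs chosen, and the function names are all hypothetical, and this is not how Galaxy's own tool search works.

```python
import xml.etree.ElementTree as ET

# Hypothetical tool wrappers; only the edam_* elements matter here.
TOOL_XML = {
    "tool_a": """<tool id="tool_a" name="Tool A">
        <edam_topics><edam_topic>topic_0622</edam_topic></edam_topics>
        <edam_operations><edam_operation>operation_0361</edam_operation></edam_operations>
    </tool>""",
    "tool_b": """<tool id="tool_b" name="Tool B">
        <edam_topics><edam_topic>topic_3168</edam_topic></edam_topics>
    </tool>""",
    "tool_c": """<tool id="tool_c" name="Tool C"></tool>""",
}


def edam_terms(xml_text):
    """Return the set of EDAM topic and operation IDs declared in one tool XML."""
    root = ET.fromstring(xml_text)
    terms = set()
    for path in ("edam_topics/edam_topic", "edam_operations/edam_operation"):
        for element in root.findall(path):
            if element.text and element.text.strip():
                terms.add(element.text.strip())
    return terms


def build_index(tools):
    """Map each EDAM ID to the sorted list of tool IDs that declare it."""
    index = {}
    for tool_id, xml_text in tools.items():
        for term in edam_terms(xml_text):
            index.setdefault(term, []).append(tool_id)
    return {term: sorted(ids) for term, ids in index.items()}


def unannotated(tools):
    """List tools that carry no EDAM annotation at all."""
    return sorted(t for t, xml_text in tools.items() if not edam_terms(xml_text))
```

An index like this could back a "search by ontology term" box, and the `unannotated` list shows wrapper authors which tools would be invisible to such a search.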

@matuskalas matuskalas added discussion enhancement New feature or request labels Jun 17, 2021
@NickSto

NickSto commented Jul 8, 2021

This is very interesting. Publishing the metadata on the servers themselves seems natural, but then my question is how the Galaxy Platform Directory finds out which servers to check in the first place without continuing to store its own list of servers.

@matuskalas matuskalas added the help wanted Extra attention is needed label Jul 10, 2021
@matuskalas
Contributor

Some thoughts on this discussed in the GCC2021 CoFest:

  • Add bio.tools IDs and ontology concept IDs into the Galaxy server "metadata"
  • Have these accessible via something like <galaxyServerUrl>/api/about, together with Galaxy version etc.
  • Create an issue on galaxyproject about this
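
A minimal sketch of what such a metadata document could look like, assuming a JSON payload. The `/api/about` route is only a proposal in this thread, and every field name and example value below is hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ServerMetadata:
    """Hypothetical shape for the proposed server metadata; not an existing Galaxy API."""
    galaxy_version: str
    biotools_id: str = ""                                # bio.tools ID of the server, if registered
    ontology_terms: list = field(default_factory=list)  # ontology concept IDs, e.g. EDAM topics


def to_json(meta):
    """Serialize the metadata with a stable key order so downstream diffs stay small."""
    return json.dumps(asdict(meta), sort_keys=True)


def from_json(text):
    """Parse a payload, ignoring unknown keys so older consumers survive new fields."""
    raw = json.loads(text)
    known = {k: raw[k] for k in ("galaxy_version", "biotools_id", "ontology_terms") if k in raw}
    return ServerMetadata(**known)


example = ServerMetadata(
    galaxy_version="21.05",
    biotools_id="example_server",   # placeholder, not a real bio.tools entry
    ontology_terms=["topic_0622"],  # placeholder EDAM ID
)
```

Keeping the payload flat and tolerant of unknown keys would let GalaxyCat, bio.tools, and the community website each read only the fields they care about.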

How this data will be integrated and viewed elsewhere:

  • I suppose the server listing at the galaxyproject.org community website would want to show these, and so would GalaxyCat and bio.tools
  • Galactic Radiotelescope could perhaps help gather the server data
  • GalaxyCat as a read-only view(?)
  • Two options for propagating the data:
    • Galaxy (Radiotelescope) -> Tools Ecosystem -> GalaxyCat
    • Galaxy (Radiotelescope) -> GalaxyCat -> Tools Ecosystem
  • Later in the future, we might consider if we also want to sync updates of the data from the Tools Ecosystem into Galaxy

@hexylena

GRT isn't the right route (that's opt-in, and very few will be part of that). There's a better one: the public server list (+scraper)! https://github.com/martenson/public-galaxy-servers/

We use that script on a cron job which pulls in the /api/configuration route from like 100 servers on a regular basis.
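
Roughly, such a collector could look like the sketch below. It is not the script from the linked repository: the function names and the injectable `fetch` parameter are assumptions, and only the `/api/configuration` route comes from this thread. The `version_major` key is likewise assumed to be present in the response.

```python
import json
import urllib.request


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the JSON body (network access required)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def collect(servers, fetch=fetch_json):
    """Poll each server's /api/configuration; record failures instead of aborting."""
    results = {}
    for base_url in servers:
        url = base_url.rstrip("/") + "/api/configuration"
        try:
            config = fetch(url)
            results[base_url] = {"ok": True, "version": config.get("version_major")}
        except Exception as error:  # one unreachable server should not stop the run
            results[base_url] = {"ok": False, "error": str(error)}
    return results
```

In a cron job this would be called with the real server list; passing a stub as `fetch` keeps it testable offline. The same loop could later read ontology terms if servers start publishing them.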

We had a page here that showed stats collected from all public Galaxy servers; I'll fix it on Monday.
https://stats.galaxyproject.eu/d/000000020/public-galaxy-servers?orgId=1&from=now-7d&to=now

4 participants