As a curator
I want to be able to add accession numbers of externally hosted linked data
So that we can link directly to relevant data hosted in external repositories such as INSDC
Acceptance criteria
Given I have a BioProject accession number (e.g. PRJNA144099) that I wish to add to a GigaDB dataset
When I add the accession number with the prefix "BioProject:" to the dataset_link table, e.g. "BioProject:PRJNA144099"
Then a link to the relevant URL is included on the GigaDB dataset page, chosen by the logged-in user's preference, or by the default for users who are not logged in:
default = https://www.ebi.ac.uk/ena/browser/view/PRJNA144099
NCBI = https://www.ncbi.nlm.nih.gov/bioproject/PRJNA144099
EBI = https://www.ebi.ac.uk/ena/browser/view/PRJNA144099
DDBJ = not offered; DDBJ does not display BioProjects submitted to NCBI or EBI, so a DDBJ link is not valid for all BioProject accessions
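The criteria above could be exercised with something like the following minimal sketch. The two URL patterns come straight from the criteria; the dictionary layout, the function name `bioproject_url` and the `preference` argument are hypothetical and do not describe how GigaDB currently stores or resolves links.

```python
# URL patterns taken from the acceptance criteria; names are illustrative only.
BIOPROJECT_URLS = {
    "EBI": "https://www.ebi.ac.uk/ena/browser/view/{acc}",
    "NCBI": "https://www.ncbi.nlm.nih.gov/bioproject/{acc}",
}
DEFAULT_SOURCE = "EBI"  # the default URL in the criteria is the EBI one


def bioproject_url(link, preference=None):
    """Turn 'BioProject:PRJNA144099' into a URL for the preferred mirror."""
    prefix, sep, accession = link.partition(":")
    if not sep or prefix != "BioProject" or not accession:
        raise ValueError(f"not a BioProject link: {link!r}")
    # Anonymous users and unsupported preferences (DDBJ has no entry here,
    # per the DDBJ line in the criteria) fall back to the default.
    source = preference if preference in BIOPROJECT_URLS else DEFAULT_SOURCE
    return BIOPROJECT_URLS[source].format(acc=accession)
```

Falling back to the default rather than raising for a DDBJ preference is one possible design choice; hiding the link or showing a notice would be equally consistent with the criteria.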
Additional Info
The user story for the website user perspective is #17
INSDC archives such as SRA, BioProject, BioSample and GenBank are mirrored in 3 different repositories around the world: NCBI in the USA, EBI in Europe and DDBJ in Asia. People have their own preferences for which of these repositories to use, and we currently attempt to let registered users choose which one they are sent to. This complicates the link-prefix table, which is why the entire method currently in use probably needs an overhaul!
NB: BioProject accessions follow regular expressions that depend on their origin, see list here
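As a rough illustration of origin-specific patterns, the sketch below maps the leading characters of a BioProject accession to the archive that issued it. The linked list is not reproduced in this ticket, so the three prefixes used here (PRJNA, PRJEB, PRJDB) are an assumption based on the usual INSDC convention and should be checked against that list. The function name is hypothetical.

```python
import re

# Assumed mapping of BioProject accession prefixes to the issuing archive.
ORIGIN_BY_PREFIX = {"PRJNA": "NCBI", "PRJEB": "EBI", "PRJDB": "DDBJ"}
BIOPROJECT_RE = re.compile(r"^(PRJNA|PRJEB|PRJDB)(\d+)$")


def bioproject_origin(accession):
    """Return the archive that issued the accession, or None if unrecognised."""
    match = BIOPROJECT_RE.match(accession.strip())
    return ORIGIN_BY_PREFIX[match.group(1)] if match else None
```

For example, `bioproject_origin("PRJNA144099")` reports NCBI as the origin, which could then be used to decide whether a DDBJ link is worth offering.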
We will need the ability to add new link prefixes quickly and easily, hence the current admin page for link prefixes.
The current list of link prefixes needs tidying up! Frankly, it's so bad I don't even know how it's still working!
Things to correct:
EBI/NCBI/DDBJ has been added to various entries that are not even mirrored in those 3 institutes!
At least 2 prefixes are present for ontologies (DOID, MEDDRA). No idea why, or whether they are used for anything; I can't see any reason why they should be included here.
yahoo? They don't provide accessions?!
http? Why is that there?
An old entry for EGA with an outdated URL remains, even though there is also a new one!
PXD = ProteomeXchange
ERA has changed name to ENA
PROJECT should be BioProject
We should include RRIDs
There is also a ticket #279 suggesting we add a column for regular expression value of accessions, which is a good idea
In addition, it would perhaps be useful to include a short description of each row to enable help icons on website to assist users in choosing the correct prefix (in the future).
Also we need to consider the implications of any changes made to the link prefix table on the display of datasets.
We may want to add mandatory checks in the admin interface before changes are actioned, e.g. two curators sign off on changes, or URLs are tested and confirmed, or something else?!
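To make the ideas above more concrete (a regex column as suggested in #279, a short description per row, and checks before a change is actioned), here is one possible sketch. Everything in it is hypothetical: the field names, the `{acc}` placeholder convention and the two-approval rule are illustrative assumptions, not the existing link-prefix schema or admin workflow. It does not contact the URL; a live check would need to be added separately.

```python
import re
from dataclasses import dataclass, field


@dataclass
class LinkPrefixChange:
    prefix: str
    url_template: str      # assumed convention: "{acc}" marks the accession
    pattern: str           # regex column, as suggested in #279
    example: str           # accession used to test the row
    description: str = ""  # short help text for a future help icon
    approved_by: set = field(default_factory=set)


def problems(change, required_approvals=2):
    """List the reasons a proposed change should not be actioned yet."""
    found = []
    if "{acc}" not in change.url_template:
        found.append("URL template has no {acc} placeholder")
    try:
        if not re.fullmatch(change.pattern, change.example):
            found.append("example accession does not match the pattern")
    except re.error:
        found.append("pattern is not a valid regular expression")
    if len(change.approved_by) < required_approvals:
        found.append("not enough curator sign-offs")
    return found
```

An empty list from `problems()` would mean the row passes these particular checks; the admin page could refuse to save until that is the case.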
Here I list the accession number providers that we either already have or know we should be ready to accept:
This could be considered part of Epic #597, but be aware that the accession links are NOT stored in the "external_links" table; they are in the "links" table. This MIGHT be a good time to revisit the logic of the schema design and to merge the external_links and links tables into 1 table, but that will need some thought and possibly has further-reaching consequences.
Certain "Attributes" should be linked to external resources. NB - those links should all open new windows (i.e. anything linking away from GigaDB.org should open a new window leaving the gigadb.org page in background).
Specifically all accessions;
e.g.
Attribute_name should_link_to{EBI} OR {NCBI} depending on personal preferences.
alternative accession-GEO {need to look up GEO URLs}
links to additional analysis {value will be URL or DOI and should be hyperlinked}
relevant electronics resources {value will be URL or DOI and should be hyperlinked}
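The requirement that outbound links open a new window, and that URL or DOI values be hyperlinked, might look roughly like this. It is only a sketch: the function name is hypothetical, and treating any non-URL value as a DOI to be resolved via https://doi.org/ is an assumption about how those attribute values are written.

```python
from html import escape


def external_link_html(value):
    """Render a URL or DOI attribute value as a link that opens a new window."""
    value = value.strip()
    if value.lower().startswith(("http://", "https://")):
        href = value
    else:
        # Assumption: anything else is a DOI, optionally written "doi:10.x/y".
        doi = value[4:] if value.lower().startswith("doi:") else value
        href = "https://doi.org/" + doi
    # target="_blank" leaves the gigadb.org page open in the background;
    # rel="noopener noreferrer" is the usual companion for such links.
    return '<a href="{}" target="_blank" rel="noopener noreferrer">{}</a>'.format(
        escape(href, quote=True), escape(value)
    )
```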
Product Backlog Item Ready Checklist
Business value is clearly articulated
Item is understood enough by the IT team so it can make an informed decision as to whether it can complete this item
Dependencies are identified and no external dependencies would block this item from being completed
At the time of the scheduled sprint, the IT team has the appropriate composition to complete this item
This item is estimated and small enough to comfortably be completed in one sprint
Acceptance criteria are clear and testable
Performance criteria, if any, are defined and testable
The Scrum team understands how to demonstrate this item at the sprint review
Product Backlog Item Done Checklist
Code is complete
Automated tests related to the changes are implemented and passing
All automated test suites are passing locally
Code is refactored to best practices and coding standards
Documentation is updated as needed
A Pull Request has been created and review requested
Pull Request is reviewed and approved
The item has been merged to the develop branch
All automated test suites are passing on continuous Integration pipeline and item is ready to release
It might be worth checking how the CURIEs idea in #424 might be implemented, as it's an equivalent system; the Names to Things application might be used here, or the Bioregistry might be used there.
@cthoyt is keen to encourage us to use the Bioregistry for this task, and they have various tools that may make it easier for us to implement, including valid regex for various things that we use. It is worth having a discussion with them before starting work on it.
Yes, I'm also happy to make any improvements to the existing software/data to support your use case. We're also thinking about reimplementations in other languages, if a combination of Python packages and web API endpoints isn't sufficient.
More info
http://gigadb.gigasciencejournal.com:9170/adminLinkPrefix/update/id/23
having no source should be allowed
http://gigadb.gigasciencejournal.com:9170/adminLink/admin
at the moment, the prefix and accession number are in the same column; they should be separated
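Separating the two could start from a small helper like the one below, run over the existing combined values before they are written to two columns. This is a sketch only: the function name is hypothetical, and returning `None` as the prefix for values that cannot be split cleanly (so a curator can review them) is an assumed policy.

```python
def split_link(value):
    """Split 'PREFIX:ACCESSION' into (prefix, accession).

    Returns (None, value) when there is no usable prefix, so such rows can
    be reviewed by hand rather than guessed at.
    """
    value = value.strip()
    prefix, sep, accession = value.partition(":")
    # A URL such as "http://..." also contains ":", so reject it explicitly.
    if not sep or not accession or accession.startswith("//"):
        return None, value
    return prefix.strip(), accession.strip()
```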
copied from #824
This could be considered part of Epic #597, but be aware that the accession links are NOT stored in the "external_links" table; they are in the "links" table. This MIGHT be a good time to revisit the logic of the schema design and to merge the external_links and links tables into 1 table, but that will need some thought and possibly has further-reaching consequences.
Certain "Attributes" should be linked to external resources. NB - those links should all open new windows (i.e. anything linking away from GigaDB.org should open a new window leaving the gigadb.org page in background).
Specifically all accessions;
e.g.
Attribute_name should_link_to{EBI} OR {NCBI} depending on personal preferences.
alternative accession-BioSample {http://www.ebi.ac.uk/ena/data/view/_accession_ }{http://www.ncbi.nlm.nih.gov/biosample/?term=_accession_ }
alternative accession-BioProject {http://www.ebi.ac.uk/ena/data/view/_accession_ } {http://www.ncbi.nlm.nih.gov/bioproject/?term=_accession_}
alternative accession-SRA_project {http://www.ebi.ac.uk/ena/data/view/_accession_ } {http://www.ncbi.nlm.nih.gov/biosample/?term=_accession_}
alternative accession-SRA_sample {http://www.ebi.ac.uk/ena/data/view/_accession_ } {http://www.ncbi.nlm.nih.gov/biosample/?term=_accession_}
alternative accession-SRA_experiment {http://www.ebi.ac.uk/ena/data/view/_accession_ } {http://www.ncbi.nlm.nih.gov/biosample/?term=_accession_}
alternative accession-SRA_file {http://www.ebi.ac.uk/ena/data/view/_accession_ } {http://www.ncbi.nlm.nih.gov/biosample/?term=_accession_}
alternative accession-GEO {need to look up GEO URLs}
links to additional analysis {value will be URL or DOI and should be hyperlinked}
relevant electronics resources {value will be URL or DOI and should be hyperlinked}