Enable gene function search on GO terms #1465
Open
+225
−10
Fix #1388
Changes
Data Model
New tables go_term_text and go_term_to_pfam_entry model GO terms and how they map to PFAM entries. A new migration has been added to create the tables.
Ingest
Ingest GO terms by iterating over the "nodes" in the graph defined by: http://current.geneontology.org/ontology/go-basic.json
Ingest PFAM entry to GO term mapping using a file derived from: current.geneontology.org/ontology/external2go/pfam2go
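As a rough illustration of the two ingest steps above, here is a minimal sketch, not the code in this PR. It assumes go-basic.json follows the OBO Graphs JSON layout (a top-level "graphs" list whose "nodes" carry an "id" IRI and an "lbl" label), and that the mapping file keeps the upstream pfam2go line format; the derived pfam_go_mappings.txt may differ. The function names iter_go_terms and iter_pfam_go_pairs are hypothetical.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumption: GO node ids in go-basic.json are IRIs with this prefix.
GO_IRI_PREFIX = "http://purl.obolibrary.org/obo/GO_"

# Assumption: mapping lines look like the upstream pfam2go file, e.g.
# "Pfam:PF00001 7tm_1 > GO:G protein-coupled receptor activity ; GO:0004930"
PFAM2GO_LINE = re.compile(r"^Pfam:(PF\d+)\s.*;\s*(GO:\d+)\s*$")


def iter_go_terms(graph_doc: dict) -> Iterator[Tuple[str, str]]:
    """Yield (GO id, label) for every labelled GO node in the parsed JSON."""
    for graph in graph_doc.get("graphs", []):
        for node in graph.get("nodes", []):
            node_id = node.get("id", "")
            label = node.get("lbl")
            # Skip non-GO nodes and nodes without a label.
            if node_id.startswith(GO_IRI_PREFIX) and label:
                yield "GO:" + node_id[len(GO_IRI_PREFIX):], label


def iter_pfam_go_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (GO id, PFAM id) pairs, skipping '!' comment lines."""
    for line in lines:
        if line.startswith("!"):
            continue
        match = PFAM2GO_LINE.match(line.strip())
        if match:
            yield match.group(2), match.group(1)
```

The first generator would feed rows for go_term_text and the second rows for go_term_to_pfam_entry; the actual column names are defined by the migration in this PR and are not shown here.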
API
New endpoint for text search of GO terms.
Query
New logic that transforms conditions using GO terms to their associated PFAM entries.
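The transformation described above might look roughly like the following sketch. It is an illustration under assumptions, not the query code in this PR: condition values are treated as plain strings, GO terms are recognized by a "GO:" prefix, and the go_to_pfam dictionary stands in for a lookup against the go_term_to_pfam_entry table. The function name expand_go_terms is hypothetical.

```python
from typing import Dict, Iterable, List, Set


def expand_go_terms(values: Iterable[str], go_to_pfam: Dict[str, Set[str]]) -> List[str]:
    """Replace each GO term with its mapped PFAM entries; keep other values.

    Order is preserved and duplicates are dropped. A GO term with no
    mapping contributes nothing to the result.
    """
    expanded: List[str] = []
    seen: Set[str] = set()
    for value in values:
        if value.startswith("GO:"):
            # Sorted so the output is deterministic for a given mapping.
            replacements = sorted(go_to_pfam.get(value, ()))
        else:
            replacements = [value]
        for item in replacements:
            if item not in seen:
                seen.add(item)
                expanded.append(item)
    return expanded
```

With the example mapping from the Testing section, expand_go_terms(["GO:0004930"], {"GO:0004930": {"PF00001"}}) yields ["PF00001"], so a search on the GO term becomes a search on its PFAM entry.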
UI
New facet for GO has been added.
Testing
To test, you'll have to run a local ingest. Make sure you obtain the new file added to NERSC:
.../data/ingest/go/pfam_go_mappings.txt
Verify that the ingest populates the go_term_text and go_term_to_pfam_entry tables, and that searching on GO terms works through the UI. That is, make sure the new endpoint returns GO terms for a text search, and make sure you get the results you'd expect when searching using this facet. Pick some PFAM value from the gene_function table, use the new mapping table to find associated GO terms, and use those for testing. For example, GO:0004930 maps to PF00001.