After discussion with Andrew yesterday, we agreed that these changes should be made:
Change the parser to create association-centric data (unique combinations of drug-disease-status) rather than the current drug-centric data. This would be helpful, particularly for the upcoming "treats" refactor. I wrote more about the problems with the current data structure in the linked issue, Data source: repoDB #77 (comment).
Mockup of what association-centric data may look like
The current single record for rituximab would be transformed into multiple records, one for each combination of rituximab + unique disease + unique status.
So for rituximab + "Lymphoma, Non-Hodgkin" (C0024305), there'd be 3 records (3 different statuses). I didn't include all the info for the "Terminated" record since there are currently 18 objects (clinical trials) in the data.
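The transformation described above could be sketched roughly as follows. This is only an illustration, not the actual parser: the input layout and the field names `drug_id`, `drug_name`, `ind_id`, `ind_name`, `status`, and the output shape are assumptions, and the example statuses and trial values are made up for the sketch.

```python
from collections import defaultdict


def to_association_records(drug_doc):
    """Split one drug-centric document into one record per
    (drug, disease, status) combination.

    The input/output field names are hypothetical, chosen for illustration.
    """
    # Group the per-trial objects by disease + status.
    groups = defaultdict(list)
    for ind in drug_doc["indications"]:
        key = (ind["ind_id"], ind["ind_name"], ind["status"])
        groups[key].append({"NCT": ind.get("NCT"), "phase": ind.get("phase")})

    records = []
    for (ind_id, ind_name, status), trials in groups.items():
        records.append({
            # Hypothetical ID scheme: drug + disease + status.
            "_id": f"{drug_doc['drug_id']}_{ind_id}_{status}",
            "drug": {"id": drug_doc["drug_id"], "name": drug_doc["drug_name"]},
            "disease": {"id": ind_id, "name": ind_name},
            "status": status,
            "trials": trials,
        })
    return records


# Made-up example data: placeholder drug ID, illustrative statuses and trials.
example_doc = {
    "drug_id": "DRUG_ID_PLACEHOLDER",
    "drug_name": "rituximab",
    "indications": [
        {"ind_id": "C0024305", "ind_name": "Lymphoma, Non-Hodgkin",
         "status": "Approved", "NCT": "NA", "phase": "NA"},
        {"ind_id": "C0024305", "ind_name": "Lymphoma, Non-Hodgkin",
         "status": "Terminated", "NCT": "NCT_PLACEHOLDER_1", "phase": "Phase 2"},
        {"ind_id": "C0024305", "ind_name": "Lymphoma, Non-Hodgkin",
         "status": "Terminated", "NCT": "NCT_PLACEHOLDER_2", "phase": "Phase 1"},
        {"ind_id": "C0024305", "ind_name": "Lymphoma, Non-Hodgkin",
         "status": "Withdrawn", "NCT": "NCT_PLACEHOLDER_3", "phase": "Phase 3"},
    ],
}

records = to_association_records(example_doc)
```

With this made-up input, the four trial objects collapse into three records, and the "Terminated" record carries both of its trials in a nested list. Whether trials should stay nested like this or be summarized is a design choice left open here.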
Figure out what the field value "NA" means. If it basically means "not available/applicable", I'd find it helpful if the parser removed the fields with "NA" values. That way BTE would be able to use these fields without post-processing to remove "NA".
"NA" is a common value for these fields:
repodb.indications.NCT: the non-"NA" values could be useful publication reference info for BTE
repodb.indications.phase: BTE may need this info in the future as part of the "treats" refactor
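If "NA" does turn out to mean "not available/applicable", the cleanup could look something like the sketch below. The helper name `strip_na` is hypothetical, the sample record is made up, and the sketch assumes "NA" only ever appears as an exact string value.

```python
def strip_na(value, na_values=("NA",)):
    """Recursively drop dict fields and list items whose value is "NA".

    Assumes "NA" means "not available/applicable", which still needs
    to be confirmed against the repoDB data.
    """
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            item = strip_na(item, na_values)
            if item is not None:
                cleaned[key] = item
        return cleaned
    if isinstance(value, list):
        items = (strip_na(item, na_values) for item in value)
        return [item for item in items if item is not None]
    if value in na_values:
        return None  # signal to the caller that this value should be dropped
    return value


# Made-up record using the two fields named above.
sample = {"indications": [{"NCT": "NA", "phase": "Phase 2"},
                          {"NCT": "NCT_PLACEHOLDER", "phase": "NA"}]}
cleaned_sample = strip_na(sample)
```

After this step, each indication keeps only the fields that carry real values, so a consumer such as BTE would not have to filter out "NA" itself.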
Double-check whether this API is using the latest data from repoDB (v2.1, 2023-06-15, per the version history section of the repoDB website). Based on the metadata endpoint, it might be using the latest data. But the original development and deployment were in 2022, before that data release.
I think this issue has been addressed, so I'm closing it. I noted that all instances of the APIs were updated here. There were also detailed discussions in the lab Slack (one thread here that ended with all changes agreed on and deployed to CI).
@everaldorodrigo I suggest adding links to the PRs/code changes related to this issue.
PRIORITY: medium. It'd be useful to have for the upcoming biolink-model refactor ("treats"). Higher in priority than #170
While writing the SmartAPI yaml w/ x-bte annotation for BioThings repoDB, I noticed some issues.
Right now, there's 1 record for the drug Rituximab.