Dictionary: Diseases
Priya
Disease
This dictionary contains the names of diseases that commonly occur, and the names of co-occurring diseases, during any viral epidemic. Used via ami search, together with more dictionaries if needed, could it help find comorbidities during viral epidemics? (This is still being explored.)
Check out this video to see how to use a dictionary via our new software, ami!
From Wikidata, using Wikidata Query Service (SPARQL).
The latest and multilingual dictionary: from ICD-10 and SPARQL.
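As an illustration of the SPARQL route, here is a minimal sketch of how such a query could be built and sent to the Wikidata Query Service. It is not the query used for these dictionaries (that one is described at the link below). The class `wd:Q12136` (disease) and the property `wdt:P494` (ICD-10) are assumptions about which Wikidata identifiers are relevant, and the function names and the `User-Agent` string are made up for this example.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_disease_query(language="en", limit=100):
    """Return a SPARQL query listing diseases with a label and, if present, an ICD-10 code."""
    # Assumption: diseases are items that are an instance of wd:Q12136,
    # and the ICD-10 code is stored under wdt:P494.
    return f"""
SELECT ?disease ?diseaseLabel ?icd10 WHERE {{
  ?disease wdt:P31 wd:Q12136 .
  OPTIONAL {{ ?disease wdt:P494 ?icd10 . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}
}}
LIMIT {int(limit)}
""".strip()


def fetch_diseases(language="en", limit=100):
    """Run the query and return (wikidata_id, label, icd10_or_None) tuples."""
    params = urllib.parse.urlencode(
        {"query": build_disease_query(language, limit), "format": "json"}
    )
    request = urllib.request.Request(
        WDQS_ENDPOINT + "?" + params,
        headers={"User-Agent": "disease-dict-sketch/0.1"},  # hypothetical agent name
    )
    with urllib.request.urlopen(request) as response:
        bindings = json.load(response)["results"]["bindings"]
    return [
        (
            b["disease"]["value"].rsplit("/", 1)[-1],  # keep only the Q-number
            b["diseaseLabel"]["value"],
            b.get("icd10", {}).get("value"),
        )
        for b in bindings
    ]
```

Changing the `language` argument (for example to `"es"`) asks the label service for labels in another language, which is one possible way to approach a multilingual dictionary.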
The creation of these 5 disease dictionaries is described at https://github.com/petermr/openVirus/tree/master/dictionaries/diseases/disease_dict.md
Dictionaries created from the SPARQL results:
- text file
- xml file: https://github.com/petermr/openVirus/tree/master/dictionaries/diseases/disease_new.xml
  i) synonym dictionary: https://github.com/petermr/openVirus/blob/master/dictionaries/diseases/disease_synonym.xml
  ii) latest dictionary with ICD-10 codes: https://github.com/petermr/openVirus/blob/master/dictionaries/diseases/disease.xml
  iii) multilingual dictionary of 6 languages: https://github.com/petermr/openVirus/blob/master/dictionaries/diseases/disease_snh16Oct.xml
- Spanish dictionary: https://github.com/petermr/openVirus/blob/master/dictionaries/diseases/enfermedad.xml
To iterate on the synonyms in the disease dictionaries, in order to get accurate results from ami search.
I have updated the synonym dictionary for disease at https://github.com/petermr/openVirus/blob/master/dictionaries/test/disease_synonym.xml. I looked at a few pages and have some questions about the synonyms created in the dictionary:
- The synonyms include some common words/letters/numbers like "2", "Male", "face", "and neck", "X", "X-linked" and so on. If this dictionary were used in ami search, would it then create DataTables including these common words?
- I saw some words containing special letters that either have a Wikidata ID or are mentioned in a synonym: "Uberkoten" (the correct special letters cannot be typed here) has Wikidata ID Q332590, and "Chédiak" & "François" are mentioned in synonyms. Is it okay to leave them, or should they be removed manually?
- Some entry names contain the Wikidata ID instead of a name. The entries for Q886810 & Q1607642 contain no entry name, only the ID itself. Should I remove them manually?
- In the synonyms, there are bracketed words like <synonym>Dwarfism : [pitutary] or [hypophyseal (& Lorain - Levi)]</synonym>. Will such a synonym be used as a whole, in separate parts, or both? If only as a whole, might it miss some words/terms?
- Some synonyms are repeated more than twice in different entries, like "NOS" and "X-linked". Should they be removed?
- There are also acronyms in the synonyms, like PAN (Polyarteritis nodosa), DISH (Diffuse Idiopathic Skeletal Hyperostosis), etc.
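Several of the checks raised in these questions could be automated before deciding what to remove. The sketch below is only one possible approach, not part of ami: it assumes a dictionary file made of `<entry>` elements with a `name` (or `term`) attribute and `<synonym>` children, and the function name, the thresholds and the sample XML are invented for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

WIKIDATA_ID = re.compile(r"Q\d+")


def audit_dictionary(xml_text, min_length=3, common_words=frozenset()):
    """Flag generic synonyms, entries named by their Wikidata ID, and repeated synonyms."""
    root = ET.fromstring(xml_text)
    issues = {"too_generic": [], "id_as_name": [], "repeated": []}
    seen = defaultdict(set)  # lower-cased synonym -> names of entries using it
    for entry in root.iter("entry"):
        # Assumption: the entry name lives in a "name" or "term" attribute.
        name = entry.get("name") or entry.get("term") or ""
        if WIKIDATA_ID.fullmatch(name):
            issues["id_as_name"].append(name)
        for synonym in entry.iter("synonym"):
            text = (synonym.text or "").strip()
            if len(text) < min_length or text.isdigit() or text.lower() in common_words:
                issues["too_generic"].append((name, text))
            seen[text.lower()].add(name)
    # A synonym is "repeated" when it appears under more than one entry.
    issues["repeated"] = sorted(s for s, names in seen.items() if len(names) > 1)
    return issues


# Made-up sample; only the ID Q886810 and the synonym strings come from the questions above.
SAMPLE = """<dictionary title="disease">
  <entry name="pituitary dwarfism">
    <synonym>NOS</synonym>
    <synonym>2</synonym>
    <synonym>Male</synonym>
  </entry>
  <entry name="Q886810">
    <synonym>NOS</synonym>
    <synonym>X</synonym>
  </entry>
</dictionary>"""

report = audit_dictionary(SAMPLE, common_words=frozenset({"male", "face"}))
```

Length alone does not catch words such as "Male" or "face", so the sketch takes an explicit `common_words` set. Whether flagged synonyms should be deleted or kept is still the open question; the report only lists candidates for manual review.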
- Downloading the SPARQL results as a CSV document and converting it into a text file changes many of the disease names: symbols such as "-" and some diacritics were turned into special characters like "–" and "ü". These were rectified manually, which took a lot of time.
- In the text document, only the disease name can be given, not other attributes such as description, altLabels, etc. There should be only one disease name per line. (The few other attributes have to be added through the syntax during dictionary creation, and this still needs development.)
- Downloading the SPARQL results as an XML file does not need any manual rectification. The file can be used to create dictionaries directly using similar syntax (though the code still needs some improvements and is being developed day by day).
- The other attributes for disease (or any other dictionary) can be included in the XML file and converted into the dictionary easily.
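The garbled characters described for the CSV route look like UTF-8 text that was read as Windows-1252 at some point. Assuming that is the cause (it has not been confirmed for these files), the manual rectification could be sketched in Python as follows; the function names are hypothetical.

```python
def repair_mojibake(text):
    """Undo UTF-8 text that was wrongly decoded as Windows-1252; leave clean text alone."""
    try:
        # Re-create the original bytes, then decode them with the right codec.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage (for example, already correct text), so keep it.
        return text


def repair_lines(lines):
    """Repair each disease name and drop blank lines, keeping one name per line."""
    return [repair_mojibake(line.strip()) for line in lines if line.strip()]


print(repair_mojibake("ü"))   # prints ü
print(repair_mojibake("–"))  # prints – (an en dash)
print(repair_mojibake("Chédiak"))  # already correct, printed unchanged
```

If the assumption holds, opening the downloaded CSV explicitly with `encoding="utf-8"` during conversion would likely avoid the damage in the first place, so no repair step would be needed.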