Dictionary: Diseases

Lakshmi Devi Priya edited this page Oct 26, 2020 · 17 revisions

Owner :

Priya

Dictionary :

Disease

Overview :

This dictionary contains the names of commonly occurring diseases and of diseases that co-occur during viral epidemics. An open question still being explored: can this dictionary, used in ami search together with other dictionaries where needed, find comorbidities during viral epidemics?

[Check out this video to use a dictionary via our new software ami!]

Source :

From Wikidata using Wikidata Query Service (SPARQL).

The latest and multilingual dictionaries: from ICD-10 and SPARQL.
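As a rough illustration of what a Wikidata query for diseases with ICD-10 codes might look like, the sketch below only builds the query text and the request URL; it does not send anything. It is not the query used for these dictionaries (that is documented on the creation page). The variable names `?disease` and `?icd10`, the default `limit`, and the helper names `build_query` and `build_url` are assumptions made for this example.

```python
from urllib.parse import urlencode

# Public endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(language="en", limit=100):
    """Return a SPARQL query for items that are an instance of 'disease'.

    wdt:P31 = instance of, wd:Q12136 = disease, wdt:P494 = ICD-10.
    The label service fills ?diseaseLabel and ?diseaseAltLabel.
    """
    return (
        "SELECT ?disease ?diseaseLabel ?diseaseAltLabel ?icd10 WHERE {\n"
        "  ?disease wdt:P31 wd:Q12136 .\n"
        "  OPTIONAL { ?disease wdt:P494 ?icd10 . }\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_url(query):
    """Encode the query as a GET URL for the endpoint (nothing is sent)."""
    return ENDPOINT + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_query("en", 5))
    print(build_url(build_query("en", 5)))
```

Changing `language` gives labels in another language, which is one possible route to a multilingual dictionary; the actual multilingual file may have been produced differently.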

Creation :

The creation of these 5 disease dictionaries is described at

https://github.com/petermr/openVirus/tree/master/dictionaries/diseases/disease_dict.md

Find here :

Valid Disease Dictionary

Dictionaries created from SPARQL results:

i) synonym dictionary - https://github.com/petermr/openVirus/blob/master/dictionaries/diseases/disease_synonym.xml

ii) latest dictionary with ICD-10 codes - https://github.com/petermr/openVirus/blob/master/dictionaries/diseases/disease.xml

iii) multilingual dictionary in 6 languages - https://github.com/petermr/openVirus/blob/master/dictionaries/diseases/disease_snh16Oct.xml

To be done :

Iterate over the synonyms in the disease dictionaries to get accurate results from ami search.



Issues and doubts

I have updated the disease synonym dictionary at https://github.com/petermr/openVirus/blob/master/dictionaries/test/disease_synonym.xml. After looking through a few pages, I have some questions about the synonyms in the dictionary:

  1. The synonyms include some common words, letters, and numbers like 2, Male, face, and neck, X, X-linked and so on. If this dictionary is used in ami search, will the resulting DataTables include these common words?
  2. Some words contain special letters but either have a Wikidata ID or appear among the synonyms. For example, Uberkoten (the correct special letters cannot be reproduced here) has Wikidata ID Q332590, and Chédiak & François appear in synonyms. Is it okay to leave them, or should they be removed manually?
  3. Some entries use the Wikidata ID in place of a name. Q886810 and Q1607642 have no entry name, only the ID itself. Should I remove them manually?
  4. Some synonyms contain bracketed words, like <synonym>Dwarfism : [pitutary] or [hypophyseal (& Lorain - Levi)]</synonym>. Will such a synonym be matched as a whole, as separate words, or both? If only as a whole, might it miss some words or terms?
  5. Some synonyms, like NOS and X-linked, appear more than twice across different entries. Should they be removed?
  6. The synonyms also include acronyms like PAN (Polyarteritis nodosa), DISH (Diffuse Idiopathic Skeletal Hyperostosis), etc.
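Several of the checks above could be automated before deciding what to remove by hand. The following is a minimal sketch, assuming the dictionary is laid out as `<entry term="...">` elements with `<synonym>` children; the `GENERIC` stop list, the `SAMPLE` data, and the function name `audit_dictionary` are made up for illustration. It only reports candidates and changes nothing.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical stop list; extend it after reviewing real ami search output.
GENERIC = {"male", "face", "neck", "x", "x-linked", "nos"}

# Made-up sample in the assumed dictionary layout.
SAMPLE = """<dictionary title="disease">
  <entry term="example disease a" name="Example disease A">
    <synonym>2</synonym>
    <synonym>NOS</synonym>
    <synonym>EDA</synonym>
  </entry>
  <entry term="Q886810" name="Q886810">
    <synonym>NOS</synonym>
    <synonym>Male</synonym>
  </entry>
</dictionary>"""


def audit_dictionary(xml_text):
    """Report synonyms and entries that probably need manual review."""
    root = ET.fromstring(xml_text)
    generic, id_named = [], []
    seen = defaultdict(set)  # lower-cased synonym -> terms of entries using it
    for entry in root.iter("entry"):
        term = entry.get("term", "")
        # Question 3: the entry term is a bare Wikidata ID such as Q886810.
        if re.fullmatch(r"Q\d+", term):
            id_named.append(term)
        for syn in entry.iter("synonym"):
            text = (syn.text or "").strip()
            key = text.lower()
            seen[key].add(term)
            # Question 1: single characters, bare numbers, stop-listed words.
            if len(text) < 2 or text.isdigit() or key in GENERIC:
                generic.append((term, text))
    # Question 5: the same synonym is attached to more than one entry.
    repeated = sorted(k for k, terms in seen.items() if len(terms) > 1)
    return {"generic": generic, "id_named": id_named, "repeated": repeated}


if __name__ == "__main__":
    for kind, items in audit_dictionary(SAMPLE).items():
        print(kind, items)
```

Acronyms such as PAN or DISH are deliberately not flagged here, because whether they should stay depends on how ami search matches short upper-case terms.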

Preferring xml file

TEXT FILE from SPARQL

  • Downloading the results from SPARQL as a CSV document and converting it into a text file alters many disease names: symbols such as - and some diacritics were changed into special characters like –,ü. These had to be rectified manually, which took a lot of time.
  • The text document can hold only the disease name, not other attributes such as description, altLabels, etc., and each line should contain only one disease name. (The few other attributes have to be added through the syntax during dictionary creation, and this still needs development.)
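One common cause of altered characters is that UTF-8 bytes get read with a different encoding somewhere between download and conversion; whether that is what happened here is an assumption. The sketch below reads the CSV and writes the one-name-per-line text file with UTF-8 stated explicitly at both ends. The column name `diseaseLabel` and the function names are hypothetical.

```python
import csv
import io


def show_mojibake(text):
    """Show what UTF-8 text looks like when wrongly decoded as cp1252."""
    return text.encode("utf-8").decode("cp1252")


def names_from_csv(csv_text, column="diseaseLabel"):
    """Return the unique, non-empty values of one CSV column, in order."""
    names = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = (row.get(column) or "").strip()
        if name and name not in names:
            names.append(name)
    return names


def csv_file_to_text_file(csv_path, txt_path, column="diseaseLabel"):
    """Convert a CSV download into a text file with one name per line."""
    # Naming the encoding avoids falling back to the platform default.
    with open(csv_path, encoding="utf-8", newline="") as f:
        names = names_from_csv(f.read(), column)
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write("\n".join(names) + "\n")
    return names


if __name__ == "__main__":
    print(show_mojibake("ü"))  # two garbled characters instead of one letter
```

If the damage appears only after opening the file in a spreadsheet or editor, the file itself may be fine and the viewing program may be guessing the encoding wrongly.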

XML FILE from SPARQL

  • Downloading the results from SPARQL as an XML file needs no manual rectification. The file can be used to create dictionaries directly with similar syntax (though the code still needs some improvements and is being developed day by day).
  • The other attributes of a disease (or of any other dictionary) can be included in the XML file and converted into the dictionary easily.
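To make the XML route concrete, here is a minimal sketch that turns a result file in the W3C SPARQL Query Results XML Format into a dictionary-style XML tree. It is not the ami conversion code. The binding names (`disease`, `diseaseLabel`, `diseaseAltLabel`, `diseaseDescription`), the output attribute names, and the placeholder ID Q0000001 are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SPARQL Query Results XML Format.
NS = {"s": "http://www.w3.org/2005/sparql-results#"}

# Made-up result with a placeholder Wikidata ID.
SPARQL_XML = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="disease"/>
    <variable name="diseaseLabel"/>
    <variable name="diseaseAltLabel"/>
  </head>
  <results>
    <result>
      <binding name="disease"><uri>http://www.wikidata.org/entity/Q0000001</uri></binding>
      <binding name="diseaseLabel"><literal xml:lang="en">Example disease</literal></binding>
      <binding name="diseaseAltLabel"><literal xml:lang="en">ED, example disorder</literal></binding>
    </result>
  </results>
</sparql>"""


def sparql_xml_to_dictionary(sparql_xml, title="disease"):
    """Build a <dictionary> element from SPARQL XML results."""
    root = ET.fromstring(sparql_xml)
    dictionary = ET.Element("dictionary", {"title": title})
    for result in root.iterfind("s:results/s:result", NS):
        values = {}
        for binding in result.iterfind("s:binding", NS):
            if len(binding) == 0:
                continue  # binding without a <uri> or <literal> child
            values[binding.get("name")] = (binding[0].text or "").strip()
        label = values.get("diseaseLabel")
        if not label:
            continue
        entry = ET.SubElement(dictionary, "entry", {
            "term": label.lower(),
            "name": label,
            # Keep only the Q-number at the end of the entity URI.
            "wikidataID": values.get("disease", "").rsplit("/", 1)[-1],
        })
        if values.get("diseaseDescription"):
            entry.set("description", values["diseaseDescription"])
        # The label service returns alternative labels as one comma-separated
        # string, so a synonym that itself contains a comma would be split.
        for alt in values.get("diseaseAltLabel", "").split(","):
            if alt.strip():
                ET.SubElement(entry, "synonym").text = alt.strip()
    return dictionary


if __name__ == "__main__":
    print(ET.tostring(sparql_xml_to_dictionary(SPARQL_XML), encoding="unicode"))
```

Because the parser reads the encoding from the XML itself, names with diacritics pass through unchanged, which matches the observation that the XML download needs no manual rectification.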