Glottochronology #1435
vmonakhov added a commit that referenced this issue on Jun 5, 2023
vmonakhov added a commit that referenced this issue on Jun 22, 2023
vmonakhov added a commit that referenced this issue on Jul 5, 2023
Changed the distance formula. The current one is in the main text.
Task: implement a Glottochronology tool.
Theory and implementation:
The tool uses the 100-word Swadesh list, a set of basic vocabulary items (given here in Russian) for comparing arbitrary languages and calculating their relationship (i.e. distance). The relationship is based on the etymological links between Swadesh words within each pair of dictionaries.
The resulting distance is calculated using the following formula:
distance = sqrt( ln(linked_words / total_words) / -0.1 / sqrt(linked_words / total_words) ), where:
The maximum distance given by the formula above is 21.46, reached when linked_words / total_words == 1/100. The minimum possible distance is zero, reached when linked_words == total_words.
There is also a hard-coded value, distance == 25. It is used when linked_words and/or total_words are zero. A large distance indicates a weak relationship; a small distance indicates that the dictionaries, and the corresponding languages or dialects, are close.
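The formula and the fallback value can be sketched in Python as follows. This is only an illustration of the description above, not the tool's actual code: the function name `glottochronology_distance` and the constant name `FALLBACK_DISTANCE` are hypothetical, and the sketch assumes linked_words never exceeds total_words.

```python
import math

# Hard-coded value from the description, used when either count is zero.
FALLBACK_DISTANCE = 25.0


def glottochronology_distance(linked_words, total_words):
    """Distance between two dictionaries, following the formula above.

    Assumes 0 <= linked_words <= total_words (an assumption of this sketch).
    """
    if linked_words <= 0 or total_words <= 0:
        return FALLBACK_DISTANCE
    ratio = linked_words / total_words
    # distance = sqrt( ln(ratio) / -0.1 / sqrt(ratio) )
    return math.sqrt(math.log(ratio) / -0.1 / math.sqrt(ratio))


print(round(glottochronology_distance(1, 100), 2))   # 21.46, the stated maximum
print(glottochronology_distance(0, 100))             # 25.0, the fallback
```

With linked_words == total_words the logarithm is zero, so the function returns a distance of zero, which matches the stated minimum.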
Result:
a) A 2-D constellation, where the dots are dictionaries and the distances between them correspond to the values calculated by the formula.
b) A 3-D constellation. It has the same meaning as the 2-D one.
c) A table of pairwise (each-to-each) distances. It shows the calculated distance for every pair of dictionaries.
d) A table of cognates. Each row presents one etymological group. Every value has the form:
Swadesh_word [phonological_transcription] original_translation_from_dictionary
A table cell can contain more than one such item (i.e. synonyms).
e) A table of single Swadesh words by dictionary. These words have no cognates in table (d).
f) A link to an xlsx file with tables (c), (d) and (e) in the corresponding worksheets.
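As a rough sketch of how the each-to-each table (c) could be built, the Python example below computes a symmetric distance table. Everything about the input shape is an assumption made for illustration: each dictionary is a hypothetical mapping from a Swadesh word to a set of etymological group ids, total_words is taken to be the number of Swadesh words present in both dictionaries, linked_words is the number of those words that share at least one group, and the diagonal is set to zero. The real tool may count these differently.

```python
import math
from itertools import combinations


def distance(linked_words, total_words):
    # Formula and fallback value from the description.
    if linked_words <= 0 or total_words <= 0:
        return 25.0
    ratio = linked_words / total_words
    return math.sqrt(math.log(ratio) / -0.1 / math.sqrt(ratio))


def distance_table(dictionaries):
    """Build a symmetric table of pairwise distances.

    `dictionaries` maps a dictionary name to {swadesh_word: set_of_group_ids}
    (a hypothetical structure chosen for this sketch).
    """
    names = sorted(dictionaries)
    table = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        # Swadesh words present in both dictionaries (assumed total_words).
        shared = dictionaries[a].keys() & dictionaries[b].keys()
        # Words whose etymological groups overlap (assumed linked_words).
        linked = sum(1 for w in shared if dictionaries[a][w] & dictionaries[b][w])
        table[a][b] = table[b][a] = distance(linked, len(shared))
    return table


sample = {
    "A": {"water": {1}, "fire": {2}},
    "B": {"water": {1}, "fire": {3}},
    "C": {"water": {9}, "fire": {8}},
}
result = distance_table(sample)
```

In this sample, A and B share one linked word out of two, while A and C share none and therefore get the fallback distance of 25.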
Limitations:
Any limitations that were applied are listed at the bottom of the modal window. These can be:
g) Hidden tables. If the calculated output is too large, some tables may be hidden. The limit is 1M HTML characters for the combined size of the tables.
h) Not all input dictionaries were processed. If an input dictionary has fewer than 50 Swadesh words, the tool does not process it.
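Limitation (h) amounts to a simple filter on the input. The sketch below shows one way it might look. The names `MIN_SWADESH_WORDS` and `split_by_coverage` are hypothetical, and the input is assumed to map each dictionary name to a collection of its Swadesh words.

```python
# Threshold from limitation (h): dictionaries with fewer words are skipped.
MIN_SWADESH_WORDS = 50


def split_by_coverage(dictionaries, minimum=MIN_SWADESH_WORDS):
    """Separate dictionaries that are processed from those that are skipped.

    `dictionaries` maps a name to a collection of Swadesh words
    (a hypothetical structure chosen for this sketch).
    """
    processed = {
        name: words
        for name, words in dictionaries.items()
        if len(words) >= minimum
    }
    skipped = sorted(name for name in dictionaries if name not in processed)
    return processed, skipped


example = {
    "full": [f"word{i}" for i in range(100)],
    "borderline": [f"word{i}" for i in range(50)],
    "sparse": [f"word{i}" for i in range(49)],
}
kept, dropped = split_by_coverage(example)
```

Returning the skipped names alongside the kept dictionaries makes it easy to report them, in the way the modal window is described as doing.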