missing word exceptional forms #175
The *.exc files from the PWN 3.0 distribution were not used in our code to generate the RDF representation that we used (based on https://www.w3.org/TR/wordnet-rdf/). But I think we can easily add them to the OWN-EN RDF using an extra data property |
Documentation about those files:
|
While doing so, I learned that not all exceptions described in the .exc files could be mapped. For instance:
This happens because the target lemmas could not be found in the wordnet. In the examples, the forms |
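A minimal sketch of the mapping problem described above, assuming the documented .exc layout (one inflected form followed by one or more base forms per line). The sample lines and the `known_lemmas` set are made up for illustration, and `parse_exc` and `split_by_known_lemma` are hypothetical helper names, not functions from pyownpt.

```python
from typing import Dict, Iterable, List, Set, Tuple


def parse_exc(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each inflected form to the base forms listed on its line."""
    exceptions: Dict[str, List[str]] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        inflected, bases = fields[0], fields[1:]
        exceptions.setdefault(inflected, []).extend(bases)
    return exceptions


def split_by_known_lemma(
    exceptions: Dict[str, List[str]], known_lemmas: Set[str]
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Separate (inflected, base) pairs by whether the base lemma is known."""
    mapped: List[Tuple[str, str]] = []
    unmapped: List[Tuple[str, str]] = []
    for inflected, bases in exceptions.items():
        for base in bases:
            pair = (inflected, base)
            (mapped if base in known_lemmas else unmapped).append(pair)
    return mapped, unmapped


# Illustrative data only; not taken from the real .exc files or the RDF.
sample = ["geese goose", "mice mouse", "wildcatting wildcat", ""]
known_lemmas = {"goose", "mouse"}

mapped, unmapped = split_by_known_lemma(parse_exc(sample), known_lemmas)
print("mapped:", mapped)
print("unmapped:", unmapped)
```

Pairs that end up in `unmapped` correspond to the exceptions that cannot be attached to any existing word.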
We will be able to close this only after #177 is resolved. |
In 7e54978 we fixed that, considering new words with property
Running the command below outputs:
python3 pyownpt/cli/morpho_exceptions.py own-files/own-en-words.ttl WordNet-3.0/dict/ -o own-en-words.ttl -v
INFO:root:loading data from file 'openWordnet-PT/own-files/own-en-words.ttl'
INFO:root:loading data from file 'openWordnet-PT/WordNet-3.0/dict/adj.exc'
INFO:root:loading data from file 'openWordnet-PT/WordNet-3.0/dict/adv.exc'
INFO:root:loading data from file 'openWordnet-PT/WordNet-3.0/dict/noun.exc'
INFO:root:loading data from file 'openWordnet-PT/WordNet-3.0/dict/verb.exc'
INFO:ownpt:processing 1490 exceptions with pos 'a'
INFO:ownpt:processing 7 exceptions with pos 'r'
INFO:ownpt:processing 2054 exceptions with pos 'n'
INFO:ownpt:processing 2401 exceptions with pos 'v'
INFO:ownpt:action applied to 6053 cases
INFO:ownpt:action applied to 6053 cases
total: 4464 triples added
total: 4467 exceptions processed
total: 1586 exceptions not processed
INFO:ownpt:after action, 4464 triples were added
INFO:root:serializing output to 'own-en-words.ttl' |
The number of exceptions in the output is different from the previous comment? Can you list one example of the result here? I could not see the new file, but I am expecting that own-en and pwn-pt now have words like word-dog-v. Is that right? How were the exceptions added? |
We have used so far
Considering the current one (first below), I think we can't use lexicalForm anymore because the exceptionalForm is a lexicalForm too. Besides that, exceptional means unusual or outstanding. It makes sense for the original PWN if all other regular inflections are considered the usual or normal ones. So
|
Just to make sure I understood your input, so that we can decide on the properties' names. |
Sure. It's important to have informative names for the properties. In https://wordnet.princeton.edu/documentation/wndb5wn, were the
I mean, although those exceptions are natural in the language, and should be understood simply as other forms, those cases are still exceptional in the morphological sense. This way of thinking may justify the description above. |
Let us use the LMF DTD as reference for now: wn30:lemma, and wn30:altForm (from https://www.w3.org/TR/swbp-skos-core-spec/#altLabel) for the exceptions. Does that work for you? If not, would wn30:otherForm work? If so, I would be fine with any of those. |
Please note that the data needs to be fixed, and the wn30.ttl vocabulary too. |
Yes. It was expected that once 156d2e1 started considering parts of speech when adding those morphological exceptions, the number of exceptions not processed would be greater, or at least the same as before. In the first case, not considering pos, we had 1254 cases not applied. After considering pos, we had 1586 cases not applied. Checking the 332 new cases: they occur because even if there is a Word with the suitable
For instance: The exception |
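A small sketch of why a part-of-speech-aware lookup can only leave the same number of exceptions unprocessed, or more. It assumes (this is an interpretation, not confirmed by the thread) that an exception is applied only when a word with the same lemma and the same pos exists. All data below is hypothetical.

```python
# Hypothetical words present in the wordnet, as (lemma, pos) pairs.
words = {("zip", "v"), ("zip", "n"), ("wildcat", "n"), ("goose", "n")}

# Hypothetical exceptions, as (pos, inflected form, base form).
exceptions = [
    ("v", "zipping", "zip"),
    ("v", "wildcatting", "wildcat"),
    ("n", "geese", "goose"),
    ("n", "oxen", "ox"),
]

# Lookup that ignores pos: any word with a matching lemma is enough.
lemmas_any_pos = {lemma for lemma, _pos in words}
missed_ignoring_pos = [e for e in exceptions if e[2] not in lemmas_any_pos]

# Lookup that requires both the lemma and the pos to match.
missed_with_pos = [e for e in exceptions if (e[2], e[0]) not in words]

print(len(missed_ignoring_pos), "missed when ignoring pos")
print(len(missed_with_pos), "missed when requiring pos")
```

Every exception missed by the lemma-only lookup is also missed by the stricter one, so the second count can never be smaller. In this toy data the extra miss is the verb exception whose lemma exists only as a noun.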
Sure. In #177, we discuss expanding words, with a new property
After that comes #175 (comment). We consider the property
For instance: for the exception
<https://w3id.org/own-pt/wn30-en/instances/word-zip-v> wn30:exceptionalForm "zipping"@en
Another example: for the exception
WARNING:ownpt:could not process exception:v: wildcatting wildcat |
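To make the shape of the added triples concrete, here is a rough sketch that formats one Turtle statement per exception. The IRI pattern `word-<lemma>-<pos>` is inferred only from the examples in this thread (word-zip-v, word-dog-v), the helper names are hypothetical, and no escaping of special characters is attempted; the actual pyownpt code may build IRIs and literals differently.

```python
# Base IRI copied from the example triple in this thread.
BASE = "https://w3id.org/own-pt/wn30-en/instances/"


def word_iri(lemma: str, pos: str) -> str:
    """Build a word IRI; the pattern is an assumption based on the thread."""
    return f"{BASE}word-{lemma}-{pos}"


def exception_triple(
    lemma: str, pos: str, form: str, prop: str = "wn30:exceptionalForm"
) -> str:
    """Format one Turtle statement linking a word to an irregular form."""
    return f'<{word_iri(lemma, pos)}> {prop} "{form}"@en .'


print(exception_triple("zip", "v", "zipping"))
```

Passing a different `prop` value lets the same helper emit whichever property name is finally agreed on.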
We use
sed "s/wn30:lexicalForm/wn30:lemma/g" -i wn30.ttl own-files/*
sed "s/wn30:exceptionalForm/wn30:otherForm/g" -i wn30.ttl own-files/* |
In https://github.com/bond-lab/omw-data/blob/9f2df85bbbab39370e265a2e2d90d95b6d015f04/wns/pwn30/wn30.xml.xz, one can find Form items describing irregular inflections of some words, such as ramus - rami.
Just reporting for now; this kind of information isn't present here, and might be useful in the future.
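As an illustration of what those items look like, the sketch below reads Form elements from a tiny inline XML sample. The sample is hand-written to follow the general WN-LMF layout (a LexicalEntry holding a Lemma and optional Form children); its ids and contents are hypothetical and not copied from wn30.xml.xz, and `irregular_forms` is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the spirit of WN-LMF; not real data from the file.
SAMPLE = """
<LexicalResource>
  <Lexicon id="example" label="Example" language="en">
    <LexicalEntry id="example-ramus-n">
      <Lemma writtenForm="ramus" partOfSpeech="n"/>
      <Form writtenForm="rami"/>
    </LexicalEntry>
    <LexicalEntry id="example-dog-n">
      <Lemma writtenForm="dog" partOfSpeech="n"/>
    </LexicalEntry>
  </Lexicon>
</LexicalResource>
"""


def irregular_forms(xml_text):
    """Return {(lemma, pos): [forms]} for entries that have Form children."""
    root = ET.fromstring(xml_text)
    result = {}
    for entry in root.iter("LexicalEntry"):
        lemma = entry.find("Lemma")
        forms = [form.get("writtenForm") for form in entry.findall("Form")]
        if lemma is not None and forms:
            key = (lemma.get("writtenForm"), lemma.get("partOfSpeech"))
            result[key] = forms
    return result


print(irregular_forms(SAMPLE))
```

The real file is xz-compressed, so it would first need to be decompressed (for example with Python's `lzma` module) before being parsed this way.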