More on ontologies + wordnet
inariksit committed Jun 18, 2020
1 parent 155f943 commit 0e0aa9a
Showing 4 changed files with 128 additions and 4 deletions.
2 changes: 1 addition & 1 deletion legal_ontology.md
@@ -12,7 +12,7 @@ Say that you write a law that says

> You have the obligation to pay taxes.
Maybe you link your law to [WordNet](https://wordnet.princeton.edu/) and make sure you have the right sense of "you", "have", "obligation", "pay" and "taxes". Now you can translate it into any other language that has a [linked WordNet](https://github.com/GrammaticalFramework/gf-wordnet#readme) with the same identifiers.
Maybe you link your law to <wordnet> and make sure you have the right sense of "you", "have", "obligation", "pay" and "taxes". Now you can translate it into any other language that has a [linked WordNet](https://github.com/GrammaticalFramework/gf-wordnet#readme) with the same identifiers.

Now you know that _obligation_ in the sense `06785951` 'a legal agreement specifying a payment or action and the penalty for failure to comply' is translated into Bulgarian as _обвързаност_.
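
A minimal sketch of that lookup, assuming each language's WordNet is available as a plain dictionary keyed by the shared synset identifier. The dictionary names and the `translate` helper are hypothetical; only the identifier `06785951` and the two words come from the text above.

```python
# Hypothetical, hand-written lexicons keyed by the shared synset identifier.
# A real linked WordNet would be loaded from its data files instead.
ENGLISH = {"06785951": "obligation"}
BULGARIAN = {"06785951": "обвързаност"}


def translate(synset_id, target_lexicon):
    """Return the target-language word for a synset id, or None if unlinked."""
    return target_lexicon.get(synset_id)


print(ENGLISH["06785951"], "->", translate("06785951", BULGARIAN))
```

The point of the design is that the sense, not the surface word, is the key: two languages agree on a translation only because they share the identifier.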

77 changes: 74 additions & 3 deletions ontology.md
@@ -6,8 +6,79 @@ date: "2020-06-17"

Collection of concepts and their relationships, in a machine-readable format.

Kind of like this zettelkasten: here we link concepts to each other. In addition, the _links_ themselves may have types and be the subject or object in another relation, linked by something that itself has a type and can be a subject/object of yet another relation, and so on.
<sumo> is a big, well-known general ontology.

Contrast with a web of free text, this kind of structure allows for e.g. automated question answering.
### Taxonomy

(That is just one kind of ontology, there are other designs. I don't know this area very well. But other ontologies seem to contain _axioms_ or _facts_.)
The basic building block of an ontology is a hierarchy of concepts. Higher nodes represent general concepts, lower nodes more specific. Example:

                        Entity
                      /    |    \
               Biology     …     Geography
              / |  | \    … …    / |  | \
             …  …  …  …         …  …  …  …
                |                   \
            Neurology              Europe
            / | | \                / | \
           …  … …  …              …  …  …
              |                      |
         Optic nerve          Heathrow airport


Just a tree of arbitrary labels isn't particularly useful. That's why ontologies may have some of the following features.
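
To make the bare taxonomy concrete before adding those features, here is a minimal sketch that stores the example tree as child-to-parent links. The names `PARENT`, `ancestors` and `is_a` are hypothetical, and the levels elided with "…" in the diagram are collapsed, so this is an illustration rather than a faithful copy of any real ontology.

```python
# Hypothetical child -> parent links for the two branches shown in the tree.
# Intermediate levels elided with "…" in the diagram are omitted here.
PARENT = {
    "Biology": "Entity",
    "Geography": "Entity",
    "Neurology": "Biology",
    "Optic nerve": "Neurology",
    "Europe": "Geography",
    "Heathrow airport": "Europe",
}


def ancestors(concept):
    """Walk upwards from a concept to the root, most specific first."""
    path = []
    while concept in PARENT:
        concept = PARENT[concept]
        path.append(concept)
    return path


def is_a(concept, general):
    """True if `general` is the concept itself or one of its ancestors."""
    return concept == general or general in ancestors(concept)


print(ancestors("Optic nerve"))
print(is_a("Heathrow airport", "Geography"))
```

All such a structure can answer is "is X a kind of Y?", which is why the labels alone do not get you very far.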

### Relationships

So far we've seen only taxonomy membership in a strict tree structure.
* Spice Girls `is-a` Band
* Wannabe `is-a` Song

But we can also have other _relationships_.
* Spice Girls `perform` Wannabe
* Wannabe `year` 1996

Furthermore, the relation links themselves can be the subject or object in another relation.
* Perform `is-a` Action
* is-a `is-a` Relation

See [RDF](https://en.wikipedia.org/wiki/Resource_Description_Framework), a data model based on such triples.
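
The bullets above can be sketched as plain (subject, predicate, object) tuples. This is ordinary Python rather than an RDF library, and the `TRIPLES` list and `match` helper are hypothetical names chosen for illustration.

```python
# Each fact is a (subject, predicate, object) triple, as in the bullets above.
TRIPLES = [
    ("Spice Girls", "is-a", "Band"),
    ("Wannabe", "is-a", "Song"),
    ("Spice Girls", "perform", "Wannabe"),
    ("Wannabe", "year", 1996),
    # Relations can themselves be subjects of other triples.
    ("perform", "is-a", "Action"),
    ("is-a", "is-a", "Relation"),
]


def match(subject=None, predicate=None, obj=None):
    """Return all triples matching the given fields; None is a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# "What did the Spice Girls perform, and from which year is it?"
for _, _, song in match(subject="Spice Girls", predicate="perform"):
    years = [o for (_, _, o) in match(subject=song, predicate="year")]
    print(song, years)
```

Because `perform` and `is-a` appear both as predicates and as subjects, the same query function works on the relations themselves.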

### Axioms
TODO

### Mapping to logic
TODO

### Mapping to lexical resources

[Niles and Pease (2003)](http://www.adampease.org/professional/Niles-IKE.pdf) map mid-level entries from <sumo> to <wordnet>. WordNet itself has some relations like synonymy and hypernymy, and I'm not quite sure how they work together with the relations of an ontology (TODO: read the whole 2003 paper).

The main point is that a concept in an ontology corresponds to one or more synonym sets in WordNet. Consider a corner of an ontology like the following:

                    Entity
                   /      \
            Abstract       Physical
           / |  | \         / | \
    Attribute …  …  …      …  …  …
      / | \
      Measure
      /     \
    TimeMeasure  LengthMeasure


The concept `Measure` is mapped to a number of WordNet synonym sets, such as
* _space_ 00014887 '3-dimensional expanse in which everything is located'
* _time_ 15160774 'the fourth coordinate that is required (along with three spatial dimensions) to specify a physical event'

And `LengthMeasure` is mapped only to 00014887 _space_.
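
A minimal sketch of this mapping, assuming it is stored as a dictionary from concept name to a list of synset ids. The layout and the `concepts_for` helper are hypothetical, and `Measure` lists only the two synsets quoted above, not its full mapping.

```python
# Synset ids and head words as quoted above; the dict layout is hypothetical.
SYNSETS = {
    "00014887": "space",
    "15160774": "time",
}

# Only the synsets mentioned in the text; the real mapping for Measure is longer.
CONCEPT_TO_SYNSETS = {
    "Measure": ["00014887", "15160774"],
    "LengthMeasure": ["00014887"],
}


def concepts_for(synset_id):
    """Invert the mapping: which ontology concepts mention this synset?"""
    return sorted(c for c, ids in CONCEPT_TO_SYNSETS.items() if synset_id in ids)


print(SYNSETS["00014887"], concepts_for("00014887"))
print(SYNSETS["15160774"], concepts_for("15160774"))
```

The inverse lookup shows that the mapping is many-to-many: one synset can be attached to both a concept and its more specific child.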


## Ontology extraction

Quote from [Herbelot, 2011](https://web.archive.org/web/20130704143830/http://www.peerpress.de/discoursecpp.pdf)

> [O]ntology extraction — a subfield of natural language processing which, put simply, specialises in producing lists. [] Well-loved ontology extraction tasks include the retrieval of Oscar nominees, chemical reactions and dead presidents. In this kind of research, the machine is asked, for instance, to produce a list of things that are ‘like lorries’ and is expected to duly return (given the current state of the art)
>
> `truck car motorcycle plane engine hamster.`
>
> Because lorries have wheels and hamsters have too.
29 changes: 29 additions & 0 deletions sumo.md
@@ -0,0 +1,29 @@
---
date: "2020-06-18"
---

# SUMO

SUMO (Suggested Upper Merged Ontology) takes the approach of _domain_ ontologies and _merge_ ontologies. Example:

* Top level
```
           Entity
          /      \
   Abstract       Physical
   / | | \         / | \
Attribute … …   Object … Process
```
* Domain ontology
```
      AirportsFromAtoK
          / | \
<fine-grained distinctions>
      / /  |  \ \
Arlanda … … Heathrow …
```

All entries in SUMO are part of a single tree, starting from `Entity`. A domain ontology about airports is linked to the top-level ontology, in a distant subtree of physical entities.

TODO: I don't know how the linking works in practice, or if the technical details have any relevance to the scope of CCLAW readings. ([Niles, Pease (2001)](https://dl.acm.org/doi/pdf/10.1145/505168.505170) seems to describe the merging process, read later if interested.)
24 changes: 24 additions & 0 deletions wordnet.md
@@ -0,0 +1,24 @@
---
date: "2020-06-18"
---

# WordNet

[Princeton WordNet](https://wordnet.princeton.edu/) is a lexical database consisting of _synonym sets_ (synsets). Each synset has
- id
- part-of-speech (noun, verb, adjective or adverb)
- definition
- example(s) of use

The word "space" belongs to the following synsets.

* 00028950-__n__ _the unlimited expanse in which everything is located; "they tested his ability to locate objects in space"_
* 13933399-__n__ _an empty area (usually bounded in some way between things); "the architect left space in front of the building"; "they stopped at an open space in the jungle"; "the space between his teeth"_
* 08670545-__n__ _an area reserved for some particular purpose; "the laboratory's floor space"_
* 08517454-__n__ _[astronomy] any location outside the Earth's atmosphere; "the astronauts walked in outer space without a tether"; "the first major milestone in space exploration was in 1957, when the USSR's Sputnik 1 orbited the Earth"_
* 06852240-__n__ _[linguistics, publishing] a blank character used to separate successive words in writing or printing; "he said the space is the most important character in the alphabet"_
* 15197259-__n__ _[time period] the interval between two times; "it all happened in the space of 10 minutes"_
* 06401196-__n__ _a blank area; "write your name in the space provided"_
* 06875252-__n__ _[music] one of the areas between or below or above the lines of a musical staff; "the spaces are the notes F-A-C-E"_
* 04037131-__n__ _[publishing] a block of type without a raised letter; used for spacing between words or sentences_
* 01992094-__v__ _place at intervals; "Space the interviews so that you have some time between the different candidates"_
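
A minimal sketch of a synset record with the four fields listed above. The `Synset` class and the `by_pos` helper are hypothetical and are not the API of any WordNet library; the two entries are copied by hand from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """One synonym set, with the four fields listed above."""
    id: str
    pos: str            # part of speech, e.g. "n" or "v"
    definition: str
    examples: tuple = ()


# Two of the "space" synsets from the list above, copied by hand.
SPACE = [
    Synset("06401196", "n", "a blank area",
           ("write your name in the space provided",)),
    Synset("01992094", "v", "place at intervals",
           ("Space the interviews so that you have some time between "
            "the different candidates",)),
]


def by_pos(synsets, pos):
    """Keep only the synsets with the given part of speech."""
    return [s for s in synsets if s.pos == pos]


print([s.id for s in by_pos(SPACE, "v")])
```

Filtering by part of speech is the simplest form of sense disambiguation: the verb reading of "space" is separated from the noun readings before any definitions are compared.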
