<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>NLP Interchange Format (NIF) 2.0 - Core Specification</title>
<meta name="author" content="Sebastian Hellmann">
<link rel="stylesheet" href="css/main.css">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0"/>
<script src="js/jquery-1.10.2.min.js"></script>
<script src="js/jquery-litelighter.js"></script>
<script src="js/jquery-litelighter-extra.js"></script>
<script>
$(document).ready(function(){
// all you need to get started
$('pre').litelighter({style:'dark'});
});
$(function(){
$("#about").load("http://persistence.uni-leipzig.org/nlp2rdf/specification/about.html");
});
</script>
</head>
<body>
<ol class="nav lang">
<li><a href="/nlp2rdf/">Back to Overview</a></li>
<li><a href="/nlp2rdf/specification/stanbol.html">latest published version</a></li>
<li><a href="/nlp2rdf/specification/stanbol/stanbol-draft.html">latest draft</a></li>
</ol>
<ol class="nav">
<li><a href="/nlp2rdf/specification/stanbol/stanbol-draft.html">v0.0.0-draft</a></li>
</ol>
<h2>NLP Interchange Format (NIF) 2.0</h2><h1>NIF Stanbol Profile Specification</h1><h2>Document Version 0.0.0-draft (not versioned yet)</h2>
<h2>Summary</h2>
<p>The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations. NIF consists of specifications, ontologies and software (<a href="http://persistence.uni-leipzig.org/nlp2rdf">overview</a>), which are combined under the version identifier "NIF 2.0", but are <a href="http://persistence.uni-leipzig.org/nlp2rdf/specification/version.html">versioned individually</a>.</p>
<p>This document is an extension of the <a href="/nlp2rdf/specification/core/core-draft.html">NIF 2.0 Core Specification</a>. As specified in the core document, NIF has three profiles: <code>Simple</code>, <code>Stanbol</code> and <code>OA</code>. The Stanbol profile takes its name from the <a href="http://stanbol.apache.org">Apache Stanbol project</a>, which uses a very similar formalism for its NLP tool output. This specification combines that formalism with the NIF Simple profile.</p>
Additional features provided by the NIF 2.0 Stanbol Profile:
<ul>
<li>Each annotation receives its own URI and can itself be annotated.</li>
<li>Alternative annotations for the same string, each with its own confidence value.</li>
<li>Detailed provenance: each annotation is identified by its own URN, which can record the provenance of the engine configuration that produced it.</li>
</ul>
<h2>Status</h2>
<p>While the NIF Simple Profile is already very advanced, the NIF Stanbol Profile is not yet written. The reason is not a lack of feasibility or any uncertainty about the design, but mainly a lack of time, so this draft will be updated continually over the coming months. We also normally prefer to produce working software before writing a detailed specification. As the example below shows, the profile is a straightforward extension of the NIF Simple Profile. Thanks to the great work of the Apache Stanbol project, filling in the details is just a matter of convention and diligence.</p>
<h2>NIF 2.0 Stanbol Profile Example</h2>
<b>This is just a draft, but it shows that the vocabulary used in NIF Simple can be reused almost directly.</b>
<div class="example-code">
<pre class="code" data-lllanguage="generic">
#// prefix declarations (nif:, nifs:, itsrdf:, olia:, rdfs:, xsd:) are omitted in this draft
nif:String rdfs:subClassOf nifs:Annotation .

&lt;urn:uuid:context-aaa-aaa-aaa#char=0,40&gt;
    a nif:Context , nif:RFC5147String ;
    nif:beginIndex "0" ;
    nif:endIndex "40" ;
    nif:isString "My favourite actress is Natalie Portman." ;
    nif:referenceContext &lt;urn:uuid:context-aaa-aaa-aaa#char=0,40&gt; .

&lt;urn:uuid:context-aaa-aaa-aaa#char=13,20&gt;
    a nif:Word , nif:RFC5147String ;
    #// new classes for these annotations:
    a nifs:TextAnnotation , nifs:Annotation ;
    nif:anchorOf "actress" ;
    nif:beginIndex "13" ;
    nif:endIndex "20" ;
    nif:referenceContext &lt;urn:uuid:context-aaa-aaa-aaa#char=0,40&gt; ;
    #// even tokenization can have provenance and confidence
    nifs:tokConfidence "0.99"^^xsd:decimal ;
    nifs:tokAnnotatorsRef &lt;urn:uuid:engine-deployment-time-bbb-bbb&gt; .

#// provenance is kept
&lt;urn:uuid:engine-deployment-time&gt;
    rdfs:comment "OpenNLP Tokenizer and Entity Recognition version, date, etc ....." ;
    nifs:webserviceURL &lt;http://nlp2rdf.lod2.eu/nif-ws.php&gt; ;
    nifs:extractionsDate "2013-09-22T15:08:44+02:00"^^xsd:dateTime .

&lt;urn:uuid:annotation-bbb-bbb-bbb&gt;
    a nifs:Annotation ;
    nif:oliaCategory olia:CommonNoun , olia:Noun ;
    nif:oliaLink &lt;http://purl.org/olia/penn.owl#NN&gt; ;
    nif:oliaConf "0.99"^^xsd:decimal ;
    nif:oliaAnnotatorsRef &lt;urn:uuid:engine-deployment-time-bbb-bbb&gt; ;
    nifs:extractedFrom &lt;urn:uuid:context-aaa-aaa-aaa#char=13,20&gt; .

&lt;urn:uuid:annotation-ccc-ccc-ccc&gt;
    a nifs:EntityAnnotation , nifs:Annotation ;
    itsrdf:taIdentRef &lt;http://dbpedia.org/resource/Actor&gt; ;
    itsrdf:taAnnotatorsRef &lt;urn:uuid:engine-deployment-time&gt; ;
    itsrdf:taConfidence "0.99"^^xsd:decimal ;
    nifs:extractedFrom &lt;urn:uuid:context-aaa-aaa-aaa#char=13,20&gt; .
</pre></div>
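<p>The example above attaches a single entity annotation to the string "actress". As a minimal sketch of how alternative annotations with confidence values might look, the following fragment adds two competing entity annotations for the same string. It is not part of the profile: the annotation URNs, the second DBpedia resource, the confidence values and the <code>rdfs:comment</code> text are hypothetical and chosen only for illustration, and only properties already used in the example above are reused.</p>
<div class="example-code">
<pre class="code" data-lllanguage="generic">
#// hypothetical annotation URNs and confidence values, for illustration only
&lt;urn:uuid:annotation-ddd-ddd-ddd&gt;
    a nifs:EntityAnnotation , nifs:Annotation ;
    itsrdf:taIdentRef &lt;http://dbpedia.org/resource/Actor&gt; ;
    itsrdf:taConfidence "0.80"^^xsd:decimal ;
    nifs:extractedFrom &lt;urn:uuid:context-aaa-aaa-aaa#char=13,20&gt; .

#// alternative reading of the same string, with a lower confidence
&lt;urn:uuid:annotation-eee-eee-eee&gt;
    a nifs:EntityAnnotation , nifs:Annotation ;
    itsrdf:taIdentRef &lt;http://dbpedia.org/resource/Acting&gt; ;
    itsrdf:taConfidence "0.15"^^xsd:decimal ;
    nifs:extractedFrom &lt;urn:uuid:context-aaa-aaa-aaa#char=13,20&gt; .

#// an annotation has its own URI, so it can itself be described
&lt;urn:uuid:annotation-ddd-ddd-ddd&gt;
    rdfs:comment "Preferred reading (illustrative remark)." .
</pre></div>
<p>Because both annotations point to the same string through <code>nifs:extractedFrom</code>, a consumer could compare their <code>itsrdf:taConfidence</code> values to choose between the alternatives. How alternatives are finally modelled is still open in this draft.</p>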
<div id="about"></div>
<noscript><br><br><iframe border="0" src="http://persistence.uni-leipzig.org/nlp2rdf/specification/about.html" width="100%" height="800" seamless></iframe></noscript>
<img src="http://vg08.met.vgwort.de/na/71bd665cff4641fe9573c2982197ce65" width="1" height="1" alt="">
</body>
</html>