[WIP] First attempt at using Open Annotation Data Model #83

Treora · 2014-07-25T02:32:53Z

Annotations can now be serialised into JSON-LD formatted RDF, following
the data model spec: http://www.openannotation.org/spec/core/

Addresses #74

BigBlueHat · 2014-07-25T14:31:58Z

Woot! 😄

Can you paste an example of what the output JSON-LD created by this looks like and at what URL(s) it's available?

Thanks!

Treora · 2014-07-25T20:10:33Z

Here's an example annotation. Put the whole thing into the JSON-LD playground to see the data in different formats.

Note this is by no means the final version. I took the freedom to introduce Annotator's selector as http://annotatorjs.org/ns/TextRangeSelector, but this can of course be changed to anything.

The JSON-LD representation is not yet available at any URL. There are two questions here though: which URL(s) is it to be available, and which URI does the annotation itself have (preferably these overlap). The latter question seems more important to me and is not yet addressed in this first attempt. It probably depends on the particular application which URI they wish their annotations to have, and a base URI can be specified to achieve this. The default URI could be the URL that serves this individual annotation, e.g. /annotations/<id>. It would be nice to use the same URL for both the plain json annotation and for the json-ld OA format, as they are just different representations of the same resource. Content negotiation should be used to pick the right representation.

{
  "@context": {
    "cnt": "http://www.w3.org/2011/content#", 
    "prov": "http://www.w3.org/ns/prov#", 
    "annotator": "http://annotatorjs.org/ns/", 
    "oa": "http://www.w3.org/ns/oa#", 
    "dc": "http://purl.org/dc/elements/1.1/", 
    "dctypes": "http://purl.org/dc/dcmitype/", 
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#", 
    "xsd": "http://www.w3.org/2001/XMLSchema#"
  }, 
  "@id": "4zAZEVMpTvq3hQjp8vzBnA", 
  "@type": "oa:Annotation",
  "oa:hasBody": [
    {
      "cnt:chars": "bla bla blub", 
      "@type": "cnt:ContentAsText", 
      "dc:format": "text/plain"
    }
  ],
  "oa:hasTarget": {
    "oa:hasSource": {
      "@id": "http://localhost:4000/dev.html"
    }, 
    "oa:hasSelector": {
      "annotator:endContainer": "/ul[1]/li[1]", 
      "annotator:startOffset": 0, 
      "annotator:startContainer": "/ul[1]/li[1]", 
      "@type": "annotator:TextRangeSelector", 
      "annotator:endOffset": 26
    }, 
    "@type": "oa:SpecificResource"
  },
  "oa:annotatedBy": "alice",
  "oa:annotatedAt": {
    "@value": "2014-07-25T19:17:36.121460+00:00", 
    "@type": "xsd:dateTime"
  },
  "oa:serializedBy": {
    "foaf:name": "annotator-store", 
    "@id": "annotator:annotator-store", 
    "@type": "prov:Software-agent", 
    "foaf:homepage": {
      "@id": "http://annotatorjs.org"
    }
  }, 
  "oa:serializedAt": {
    "@value": "2014-07-25T19:17:36.121478+00:00", 
    "@type": "xsd:dateTime"
  },
  "oa:motivatedBy": [
    {
      "@id": "oa:commenting"
    }
  ]
}

Treora · 2014-07-28T22:52:53Z

A question now is how would we want the content negotiation to happen? It appears that there is no mimetype associated with JSON-LD "because it is just JSON". I do not agree with that decision, as it is a specific and well-defined usage of json, so there should be a mime type that uses the +json suffix [1]. Alas, application/rdf+json is already claimed by another rdf as json format [2].

An alternative to content negotiation is to simply have another URL to obtain the JSON-LD version, but that seems somewhat silly.

tilgovi · 2014-07-28T23:02:30Z

@Treora "application/ld+json"

http://www.w3.org/TR/json-ld/#interpreting-json-as-json-ld

tilgovi · 2014-07-28T23:02:55Z

I'm happy to accept a PR that adds the API without adding to store.py first. CONNEG and REST in another step.

tilgovi · 2014-07-28T23:05:56Z

The instance data should be moved to the top of the class (jsonld_namespaces, jsonld_baseuri). Even though I understand you did it to match the class and property names of the data model, Python isn't RDF or JSON and so I feel the properties should be snake_case.

Treora · 2014-07-28T23:11:27Z

"application/ld+json"

Thanks. I was using exactly that one for now, but I could not remember where I had taken it from. I already made the CONNEG and REST stuff so if it's okay I'll just include it in this pull request.

tilgovi · 2014-07-28T23:42:15Z

@Treora make it a separate PR. The discussion is cleaner that way. And perhaps @nickstenning and @azaroth42 can take a look at this in isolation and we can get it merged more quickly.

tilgovi · 2014-07-28T23:43:05Z

If we want to continue to support Python 2.6 we can also add a conditional test against the Python version when declaring the requirements in setup.py if it's desirable to run tests on Travis over the JSON-LD work.

azaroth42 · 2014-07-29T00:09:51Z

A few simplifications are possible:

Just point to the W3C context in @context; l72 could read:

context = "http://www.w3.org/ns/oa-context-20130208.json"

L120/121, L134/135: Not sure if there's some magic, or if there'll be just one @type here? Looks like a bug from first read.
L110 makes the assertion that tags are about the Target, not about the annotation. I'm happy with that ... please can it be reflected in the documentation?
L150/152 could be just motivations.append("oa:(name)ing") -- the context does the mapping to @id
L169 should add a new @context to define annotator: prefix. Then its only included when it's needed, not all the time (and makes the @context line at the top correct). See this thread for more info, though: http://lists.w3.org/Archives/Public/public-linked-json/2014Jul/0011.html
- L189 Semantify 👍
- L218 Context already defines xsd:dateTime so no need to do it here too. (Yes, the spec document for serialization doesn't list this... That would be a bug!)

Thanks for the work!!

azaroth42 · 2014-07-29T00:11:03Z

Regarding the contexts, you could also do:

context = ["http://www.w3.org/ns/oa-context-20130208.json", {"annotator" : "http://annotatorjs.org/ns/"}]

And then ignore the comment about L169

Treora · 2014-07-29T03:43:42Z

Thanks for the comments. I did not know w3c published the OA JSON-LD context, that's a nice service, makes our code and JSON-LD documents cleaner.

tilgovi · 2014-07-29T09:20:24Z

annotator/annotation.py

@@ -40,6 +52,8 @@ class Annotation(es.Model):
    __type__ = TYPE
    __mapping__ = MAPPING

+    jsonld_baseurl = ''


Feels better to leave this out to me. Otherwise we might end up setting a base=''. I don't know what JSON-LD processors are supposed to do with that. Let the caller add it to the context if necessary, I think.

I think it's nice to have some default setting for it, so it's immediately obvious that it can be set, there's a place for its documentation, it shows up in tab completion in IDEs, etcetera. Setting it to None seems to be the elegant approach here.

Then you have to guard the setting so you don't have null in the JSON there.
On Jul 29, 2014 4:09 PM, "Gerben" [email protected] wrote:

In annotator/annotation.py:

@@ -40,6 +52,8 @@ class Annotation(es.Model):
type = TYPE
mapping = MAPPING

jsonld_baseurl = ''

I think it's nice to have some default setting for it, so it's immediately
obvious that it can be set, there's a place for its documentation, it shows
up in tab completion in IDEs, etcetera. Setting it to None seems to be the
elegant approach here.

—
Reply to this email directly or view it on GitHub
https://github.com/openannotation/annotator-store/pull/83/files#r15558751
.

Adding the @base was already conditional on the trueness of jsonld_baseurl. (so in fact it would not even end up setting @base="" by default like you feared)

Okay. Seems fine but unnecessary.

tilgovi · 2014-07-29T09:20:32Z

Looking really good!

nickstenning · 2014-08-09T11:02:12Z

I'm very happy to defer to the experts (@azaroth42) on the subject of whether the serialization is good. It certainly looks very sensible to me.

That said, from a programming point of view, I think this should probably not be implemented as a property on Annotation, but as a separate concrete implementation of a Renderer interface, something like:

class OARenderer(object):
    def render(self, annotation):
        out = OrderedDict()
        out['@context'] = self.context
        out['@id'] = annotation.id
        out['@type'] = 'oa:Annotation'
        out['hasBody'] = self._has_body(annotation)
        ...
        return out

This will make it much easier to test, and avoid the problem of having numerous methods on the Annotation class which are nothing about annotations, per se, but about a specific representation of an annotation. As things currently stand in this PR, I think most programmers would be a bit surprised to find that the tags property returns a list of oa:Tag dicts.

Treora · 2014-08-10T01:56:44Z

From a programming point of view there could be many improvements, for example using something like rdflib and pyld could result in more flexible code than this hand-crafted JSON-LD. I just started with an easy first step.
A question is whether an RDF representation is considered just as an output format, or if it is to become the native format at some moment, in which case it would make some sense to keep these methods in the Annotation class.

I think that, to achieve the vision of a distributed annotation network on the web, I would like to have some sort of graph-based, semantic-web-minded storage system at some moment, rather than using Elasticsearch as a primary database with a non-standard data format that can be presented as OA JSON-LD when requested. However, this idea is so far away from the current annotator-store, which is at its core an Elasticsearch wrapper plus a REST API, that it probably makes more sense to start such a system from scratch.

I guess that for annotator-store it would be okay to just have a JSON-LD renderer, as you suggested, and consider it a nice compatibility extra, so OA-based systems can read from it.

tilgovi · 2014-08-10T02:32:02Z

Agreed on all counts, Gerben. I do think there's something to keeping the
separation between model and renderer, though. Not sure how that should
look.
On Aug 9, 2014 6:56 PM, "Gerben" [email protected] wrote:

From a programming point of view there could be many improvements, for
example using something like rdflib and pyld could result in more
flexible code than this hand-crafted JSON-LD. I just started with an easy
first step.
A question is whether an RDF representation is considered just as an
output format, or if it is to become the native format at some moment, in
which case it would make some sense to keep these methods in the Annotation
class.

I think that, to achieve the vision of a distributed annotation network on
the web, I would like to have some sort of graph-based, semantic-web-minded
storage system at some moment, rather than using Elasticsearch as a primary
database with a non-standard data format that can be presented as OA
JSON-LD when requested. However, this idea is so far away from the current
annotator-store, which is at its core an Elasticsearch wrapper plus a REST
API, that it probably makes more sense to start such a system from scratch.

I guess that for annotator-store it would be okay to just have a JSON-LD
renderer, as you suggested, and consider it a nice compatibility extra, so
OA-based systems can read from it.

—
Reply to this email directly or view it on GitHub
#83 (comment)
.

nickstenning · 2014-08-11T12:21:56Z

Does the code need access to any private information about the model instance in order to render JSON-LD? If not, it probably shouldn't be part of the model class, for a number of different reasons (testability, readability, security to name a few).

More generally, given that a model object has a potentially limitless number of representations, I think it's clear that we couldn't keep adding representation code into the model.

But I'm totally agreed that there need to be some pretty major changes to the annotator-store if it's to be more widely used, and it's worth noting that this repository was originally meant to be a reference implementation of the storage API, not a production-ready storage application.

I'm already firmly of the opinion that using ElasticSearch as a persistence layer is ridiculous, and am hoping to find time to change that soon. That said, the direction I want to move in is a database (AKA postgres) and not a triplestore.

tilgovi · 2014-08-11T14:24:02Z

+1
On Aug 11, 2014 5:22 AM, "Nick Stenning" [email protected] wrote:

Does the code need access to any private information about the model
instance in order to render JSON-LD? If not, it probably shouldn't be part
of the model class, for a number of different reasons (testability,
readability, security to name a few).

More generally, given that a model object has a potentially limitless
number of representations, I think it's fairly obvious that we couldn't
keep adding representation code into the model.

But I'm totally agreed that there need to be some pretty major changes to
the annotator-store if it's to be more widely used, and it's worth noting
that this repository was originally meant to be a reference implementation
of the storage API, not a production-ready storage application.

I'm already firmly of the opinion that using ElasticSearch as a
persistence layer is ridiculous, and am hoping to find time to change that
soon. That said, the direction I want to move in is a database (AKA
postgres) and not a triplestore.

—
Reply to this email directly or view it on GitHub
#83 (comment)
.

azaroth42 · 2014-08-11T15:49:21Z

No complaints from me, about any of it :) OA is intended for
interoperability between systems, not necessarily an internal model.
Given some of the graph boundary issues, I think that some sort of record
store (postgres, mongo, whatever) will be easier for everything other than
graph specific queries. And they could be more easily done by building
just the graph that's needed rather than maintaining all of the information
in a triplestore just in case.

BigBlueHat · 2014-08-11T15:58:27Z

To @nickstenning's point about this repo being a reference implementation, I'd like to highlight the following--and perhaps move this discussion elsewhere soon. 😃

Despite what the README currently states, this repo is actually an implementation of the Store API expected by the Annotator.Store plugin. There are other implementations of the Store API, but there could also be other storage-focused plugins that traffic in completely different formats.

Perhaps a good way forward on all these fronts is:

Tighten up relationship and docs between the Annotator.Store plugin and this repo.
(Re)curate a list of Store API implementations--I've found a few lists, and have an implementation of my own that I've not added to any of them...yet! 😄
Encourage (perhaps) the creation of more Annotator plugins to work with other storage endpoints: Linked Data Profile, PouchDB, and AtomPub seem like fun options. 😉

Regardless, we've got a solid JSON-LD representation built by @Treora and I'm excited to see it in use. 😸

nickstenning · 2014-08-11T17:23:19Z

There are other implementations of the Store API, but there could also be other storage-focused plugins that traffic in completely different formats.

Yes! And there will need to be, because this naive "REST" API is not going to work for shipping annotations between providers in the bold new world of interoperable annotations we're aiming for. I'm looking forward to discussing this with you in more detail.

azaroth42 · 2014-08-11T17:24:56Z

Discussing ... on the Open Annotation community group list? :)

nickstenning · 2014-08-11T17:25:50Z

Why yes, @azaroth42. You read my mind!

azaroth42 · 2014-08-11T17:26:48Z

👍 🍻

nickstenning · 2014-08-11T17:29:09Z

So, just in case I came across negatively -- I think this is great, @Treora! I'll happily merge this (and the attendant code to expose this repr in the API) if the object creation moves into a separate annotation renderer class with its own tests.

Treora · 2014-08-11T19:10:25Z

No worries, you don't sound negative. I was not aware it was originally intended as a reference implementation. If you wish to keep it more like that, it's totally okay with me to keep it separate (I could put it in Hypothes.is instead, so my effort would not be useless). On the other hand, it's perhaps also nice if there's a reference implementation for converting annotator's annotations to OA.

Annotations can now be serialised into JSON-LD formatted RDF, following the data model spec: http://www.openannotation.org/spec/core/

@context

Because JSON-LD spec recommends keeping @context at the top.

Let's keep the default very generic, for example we don't even know if users are Persons. Implementors can subclass Annotation to add more user info (as with any of the properties).

gergely-ujvari · 2015-03-07T10:53:23Z

Closing in favor of #119

BigBlueHat mentioned this pull request Jul 28, 2014

Feature request: Howto annotate to a database hypothesis/h#1331

Closed

tilgovi reviewed Jul 29, 2014
View reviewed changes

Treora added 2 commits August 11, 2014 15:53

First attempt at using Open Annotation Data Model

ed7f25a

Annotations can now be serialised into JSON-LD formatted RDF, following the data model spec: http://www.openannotation.org/spec/core/

Use OrderedDict for OA representation

5f7d0be

Because JSON-LD spec recommends keeping @context at the top.

Treora added 7 commits August 11, 2014 15:54

Better checks for absent fields in creating jsonld

06336d4

Fall back to dict if OrderedDict unavailable

7cbbc0b

Code reordering and renaming

e59f3eb

Small edits&fixes

8c3be44

Use OA context from w3.org

6b53f0e

Set jsonld_baseurl default to None

2229510

Semantify user

ff2af84

Let's keep the default very generic, for example we don't even know if users are Persons. Implementors can subclass Annotation to add more user info (as with any of the properties).

Treora mentioned this pull request Aug 12, 2014

[WIP] Make store return JSON-LD when preferred #90

Open

Move json-ld attributes into openannotation.py

82f1d08

Treora mentioned this pull request Aug 20, 2014

annotator-store release v0.12 #94

Closed

5 tasks

gergely-ujvari mentioned this pull request Mar 7, 2015

First attempt at using Open Annotation Data Model v2 #119

Open

gergely-ujvari closed this Mar 7, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] First attempt at using Open Annotation Data Model #83

[WIP] First attempt at using Open Annotation Data Model #83

Treora commented Jul 25, 2014

BigBlueHat commented Jul 25, 2014

Treora commented Jul 25, 2014

Treora commented Jul 28, 2014

tilgovi commented Jul 28, 2014

tilgovi commented Jul 28, 2014

tilgovi commented Jul 28, 2014

Treora commented Jul 28, 2014

tilgovi commented Jul 28, 2014

tilgovi commented Jul 28, 2014

azaroth42 commented Jul 29, 2014

azaroth42 commented Jul 29, 2014

Treora commented Jul 29, 2014

tilgovi Jul 29, 2014

Treora Jul 29, 2014

tilgovi Jul 29, 2014

Treora Jul 29, 2014

tilgovi Jul 29, 2014

tilgovi commented Jul 29, 2014

nickstenning commented Aug 9, 2014

Treora commented Aug 10, 2014

tilgovi commented Aug 10, 2014

nickstenning commented Aug 11, 2014

tilgovi commented Aug 11, 2014

azaroth42 commented Aug 11, 2014

BigBlueHat commented Aug 11, 2014

nickstenning commented Aug 11, 2014

azaroth42 commented Aug 11, 2014

nickstenning commented Aug 11, 2014

azaroth42 commented Aug 11, 2014

nickstenning commented Aug 11, 2014

Treora commented Aug 11, 2014

gergely-ujvari commented Mar 7, 2015

[WIP] First attempt at using Open Annotation Data Model #83

[WIP] First attempt at using Open Annotation Data Model #83

Conversation

Treora commented Jul 25, 2014

BigBlueHat commented Jul 25, 2014

Treora commented Jul 25, 2014

Treora commented Jul 28, 2014

tilgovi commented Jul 28, 2014

tilgovi commented Jul 28, 2014

tilgovi commented Jul 28, 2014

Treora commented Jul 28, 2014

tilgovi commented Jul 28, 2014

tilgovi commented Jul 28, 2014

azaroth42 commented Jul 29, 2014

azaroth42 commented Jul 29, 2014

Treora commented Jul 29, 2014

tilgovi Jul 29, 2014

Choose a reason for hiding this comment

Treora Jul 29, 2014

Choose a reason for hiding this comment

tilgovi Jul 29, 2014

Choose a reason for hiding this comment

Treora Jul 29, 2014

Choose a reason for hiding this comment

tilgovi Jul 29, 2014

Choose a reason for hiding this comment

tilgovi commented Jul 29, 2014

nickstenning commented Aug 9, 2014

Treora commented Aug 10, 2014

tilgovi commented Aug 10, 2014

nickstenning commented Aug 11, 2014

tilgovi commented Aug 11, 2014

azaroth42 commented Aug 11, 2014

BigBlueHat commented Aug 11, 2014

nickstenning commented Aug 11, 2014

azaroth42 commented Aug 11, 2014

nickstenning commented Aug 11, 2014

azaroth42 commented Aug 11, 2014

nickstenning commented Aug 11, 2014

Treora commented Aug 11, 2014

gergely-ujvari commented Mar 7, 2015