schema.datacommons.org: issues noted at (pre-alpha) launch #4

Open
danbri opened this issue Oct 4, 2018 · 0 comments

danbri commented Oct 4, 2018

This meta issue tracks known issues for the https://schema.datacommons.org/ vocabulary and its relationship with the dataCommons Knowledge Base (dCKB).

Technical issues relating to the underlying site software are better recorded at schema.org; we are using an early version of the schema.org code designed to support external schema extensions.

These initial schemas, and the associated data model, should be considered pre-alpha.

  • We need to clarify how and when to use different properties for the same underlying concept, for example when different datasets have related but significantly different enumerations.
  • The underlying KG has a rich notion of provenance for every data item, but this could map to graph vocabulary (and APIs) in several ways. In terms of vocabulary, we can use constructs such as "sub-property" and dataset-specific enumerations, and we are looking at how best to apply Schema.org's notion of an "external enumeration", as well as at other relevant approaches (e.g. StatDCAT, SPARQL named graphs, Data Cube, PROV). W3C's CSVW, as well as Wikidata's data model and its mapping to their query service, are also relevant.
  • We need to clarify how to handle documentation for schema.org terms that dataCommons re-uses rather than introduces, but to which it nevertheless adds significant constraints or usage guidance, or for which there are dataset-related code lists, footnotes/caveats, etc.
  • Although this vocabulary is not being proposed for general schema.org-like (markup) use, it would help to show more examples on the site.
  • We need style guidelines / syntax rules for dataset-qualified terms, especially enumerations.
    • Syntactically "/" is appealing (e.g. "/USC/35To44Years"), but "/" in term URLs affects browser behaviour when loading relative links ("docs/foo", for CSS, JS etc.). There may be mismatches with the naming style from the dcKG graph, which can initially be resolved with a 404 handler but will eventually need to be aligned.
  • These schema definitions need to be imported into the dcKG graph itself (including some portion or all of latest Schema.org)
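
To illustrate the "/" concern above, here is a minimal sketch of how a relative link resolves against a flat term URL versus a "/"-qualified one. It uses Python's standard `urljoin`, which follows the same reference-resolution rules browsers apply. The `/Person` term path and the `resolve_asset` helper are assumptions made for illustration; only `/USC/35To44Years` and `docs/foo` come from the note above.

```python
from urllib.parse import urljoin

BASE = "https://schema.datacommons.org/"


def resolve_asset(term_path: str, relative_link: str) -> str:
    """Resolve relative_link as a browser would for a page served at term_path.

    resolve_asset is a hypothetical helper for illustration, not site code.
    """
    page_url = urljoin(BASE, term_path)
    return urljoin(page_url, relative_link)


# Flat term name (hypothetical example): the link resolves at the site root.
flat = resolve_asset("/Person", "docs/foo")

# Dataset-qualified term containing "/": the link resolves under "/USC/".
nested = resolve_asset("/USC/35To44Years", "docs/foo")

print(flat)    # https://schema.datacommons.org/docs/foo
print(nested)  # https://schema.datacommons.org/USC/docs/foo
```

The second result points at a path that would not exist unless the assets were duplicated under each dataset prefix. Root-relative links ("/docs/foo") would resolve the same way from both pages, which may be one way to keep "/" in term names.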
@danbri danbri added the schemas schema.datacommons.org - vocabulary label Oct 4, 2018
@danbri danbri self-assigned this Oct 4, 2018
danbri added a commit that referenced this issue Oct 5, 2018
As noted in #4 these should be considered pre-alpha.

There is a need for QA and cross-referencing to supporting documentation. In
addition, we may declare dataset-specific (sub-)properties to be clearer
about the distinction between the various official codelists and the
underlying concepts that they communicate.
danbri pushed a commit that referenced this issue Oct 24, 2018
Added documentation of missing properties.