Expand Data Deposit API to support additional metadata #899
Comments
Original Redmine Comment This would line up appropriately with the metadata expansion that we are actively working on for Dataverse 4.0. |
Original Redmine Comment Eleni Castro wrote:
|
Original Redmine Comment See also the discussion I kicked off here: [sword-app-tech] client SHOULD add Dublin Core terms to the Atom Entry, MAY add any other metadata formats or foreign markup - http://www.mail-archive.com/[email protected]/msg00384.html I'd rather focus effort on our "native" API in 4.0, however, for supporting more metadata. It's already working. Docs at https://github.com/IQSS/dataverse/tree/master/scripts/api |
Will start with the Ubiquity Press Datasets as an example for what metadata fields we should extend support for in version 1.1 of the plugin. https://docs.google.com/document/d/1CRGw4nbOS0ccynJdq0Am-7I9uazt-dsxkO6dWMpn-ts/edit?usp=sharing Will eventually extend support to other metadata schemas (DDI, DataCite, etc) but the SWORD plugin may not be used for this but instead use the native JSON API. cc/ @pdurbin |
@axfelix @jwhitney @pdurbin Here is a sample atom-xml file that i put together based on this dataset:
|
This looks sensible enough -- the affiliation="" element of dcterms:creator is no worse than the hack we already made to isReferencedBy and I think it's fair to say that we're still not overloading it. Thanks! |
The plugin could provide Affiliation as you have it in the example, though the source value (dept., org., country) is split across several optional fields. Block text would need to be reformatted into a phrase and stripped of HTML, but it's possible to provide a reasonable value. I'm not sure of the best way to provide a description or date that describes the dataset rather than the article. Right now, the plugin maps article metadata to the dataset, then adds suppfile metadata to dataset fields that allow multiple values, like keyword. So if an OJS article has more than one suppfile in a dataset, suppfile-level keywords are combined and mapped up to the dataset, but that's not going to work well for fields that expect a single value, like @axfelix, any thoughts? Rethinking the mapping (e.g., creating one dataset for each suppfile) means multiple data citations per article (which is maybe OK, although it seems excessive) but would provide finer control over what shows up on the Dataverse side. |
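To make the current mapping concrete, here is a minimal Python sketch of how suppfile-level keywords could be combined and mapped up to one dataset-level keyword list. It is not the plugin's actual code (the OJS plugin is not written in Python); the function name `merge_keywords` and the case-insensitive de-duplication are assumptions made for illustration.

```python
def merge_keywords(article_keywords, suppfile_keywords):
    """Combine article keywords with the keywords of every suppfile.

    article_keywords: list of strings taken from the article metadata.
    suppfile_keywords: list of lists, one inner list per suppfile.
    De-duplication (case-insensitive, first spelling wins) is an
    assumption of this sketch, not something the thread specifies.
    """
    merged = []
    seen = set()
    all_keywords = list(article_keywords)
    for keywords in suppfile_keywords:
        all_keywords.extend(keywords)
    for keyword in all_keywords:
        cleaned = keyword.strip()
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            merged.append(cleaned)
    return merged
```

This works for multi-valued fields such as keyword, but it does not help with single-valued fields, which is the open question in the comment above.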
@jwhitney @axfelix Newbie question: for "Date Available" (when a dataset is published), is this information system-generated when you send the command to Dataverse to release the dataset? About the one-dataset-per-suppfile suggestion: this would be problematic on our end, since we would ultimately want all files that belong to a single dataset to be included together (one data citation). Is it possible to aggregate some of the metadata from the individual supp files being uploaded, or to have the author fill in metadata at the dataset level rather than file by file? We do not yet have a way to index or store file-level metadata in Dataverse. I imagine most people would fill in any relevant dataset-level information the first time they add a supp file? Not sure how this would work in your system, though. |
If you're going to be fairly conservative in the number of dataset-level fields, we could look at handling the description the same way as external data citations. The field's presented in the suppfile form, but is stored in article metadata, so the same value's shared across suppfiles. |
I was afraid of suggesting that for the amount of additional logic it'd take to show the same field (and pre-populate it with whatever was already entered?) on repeat suppfile uploads, but if you want to take that route, it's fine with me! |
That's lots of time: adding metadata is fairly straightforward. Do you want to drop article abstract from dataset metadata altogether, in favour of the desc. field to be added to the suppfile form? Or use abstract as a default if dataset desc. not provided? |
I prefer your second option:
|
Alex, I agree completely - we should not create one dataset per file.
|
Yeah, I agree with Eleni -- use abstract if dataset description not provided. |
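The agreed rule (use the article abstract only when no dataset description is provided) could be sketched as follows. This is an illustrative Python sketch, not the plugin's implementation; the function name `choose_description` and the treatment of whitespace-only input as "not provided" are assumptions.

```python
def choose_description(dataset_description, article_abstract):
    """Return the dataset description, falling back to the article abstract.

    A description that is None, empty, or whitespace-only counts as
    "not provided" (an assumption of this sketch).
    """
    if dataset_description and dataset_description.strip():
        return dataset_description.strip()
    return (article_abstract or "").strip()
```

For example, `choose_description("", "Article abstract.")` returns the abstract, while a non-empty dataset description always wins.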
This actually seems nicely in line with "New affiliation attributes for Creator and Contributor" in the just-released DataCite Metadata Schema Version 3.1: http://www.datacite.org/node/141 It's like @posixeleni is psychic. :) I wonder if there's anything else in there we should consider. |
Spoke with @pdurbin and clarified that the final requirement for this ticket is that we modify the Atom XML for two elements:
|
Right, and to be clear, we're no longer making any changes to dates. We used some |
Nice to see they're being proactive about changes to the DataCite standard at this point. Nothing else seems immediately worth adopting, but still good. |
As of 5257f0d we now support adding authorAffiliation, contributorName, and contributorType via SWORD via elements like this:
@kcondon to test you'll first need to run these SQL statements...
(or drop the database and set up again) ... then, if you run https://github.com/IQSS/dataverse/blob/master/scripts/api/data-deposit/create-dataset-899-expansion from the root of the repo (operates on https://github.com/IQSS/dataverse/blob/master/scripts/api/data-deposit/data/atom-entry-study-899-expansion.xml ) you should end up with a dataset like the one below. Unfortunately, it is expected that the controlled vocabulary of contributorType (Editor, Funder, Researcher, etc.) is not enforced, but we will fix this in #973. @posixeleni I left a reminder in the SWORD backward compatibility doc that we need to document this new functionality: "FIXME: Document via example (in XML) how we now support authorAffiliation, contributorName, and contributorType as of #899." |
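For readers who want a rough picture of such an Atom entry, here is a hedged Python sketch that builds one with the standard library. It is not the content of atom-entry-study-899-expansion.xml. The `affiliation` attribute on `dcterms:creator` follows the earlier discussion in this thread; carrying contributorType in a `type` attribute on `dcterms:contributor` is an assumption of this sketch, as are the function name `build_entry` and the placeholder names and values.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Serialize Atom as the default namespace and Dublin Core terms as "dcterms".
ET.register_namespace("", ATOM_NS)
ET.register_namespace("dcterms", DCTERMS_NS)


def build_entry(title, creators, contributors):
    """Build an Atom entry string.

    creators: list of (name, affiliation) tuples.
    contributors: list of (name, contributor_type) tuples.
    The "type" attribute name is an assumption of this sketch.
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{DCTERMS_NS}}}title").text = title
    for name, affiliation in creators:
        creator = ET.SubElement(entry, f"{{{DCTERMS_NS}}}creator")
        creator.text = name
        if affiliation:
            creator.set("affiliation", affiliation)
    for name, contributor_type in contributors:
        contributor = ET.SubElement(entry, f"{{{DCTERMS_NS}}}contributor")
        contributor.text = name
        if contributor_type:
            contributor.set("type", contributor_type)
    return ET.tostring(entry, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values, purely for illustration.
    print(build_entry(
        "Example dataset",
        [("Doe, Jane", "Example University")],
        [("Example Foundation", "Funder")],
    ))
```

Because the contributorType vocabulary is not enforced yet (see #973), a client would have to make sure it only sends values such as Editor, Funder, or Researcher.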
@esotiri or @kcondon after you have tested this on dvn-build can you please do a build on https://apitest.dataverse.org so @jwhitney can test? This was mentioned by @posixeleni at [pkp-dataverse-integration] Email Your Updates for PKP-OJS project. Thanks! |
The latest code is on https://apitest.dataverse.org for @jwhitney to test. |
Thanks! Testing now. |
- dcterms:available never came to fruition, see #899 - also multiple dcterms:coverage works fine now, enabling
dataset created with atom-entry-study-899-expansion.xml |
issue resolved |
This is the expected error if you try to use SWORD with a non-existent XML input file. Please see #893 (comment) for details. |
Author Name: Eleni Castro (@posixeleni)
Original Redmine Issue: 3425, https://redmine.hmdc.harvard.edu/issues/3425
Original Date: 2014-01-22
So far, according to the OJS Dataverse plugin testers we surveyed (results recorded at https://docs.google.com/spreadsheet/ccc?key=0AjeLxEN77UZodDJyd0pZdnlDZ3I5eWxnOHBmV1Q4dHc&usp=sharing ), the most commonly requested feature is the ability to customize which metadata fields are available as part of the data deposit form, which should be implemented in a future version. To support this, we will need to expand the API's metadata support beyond Dublin Core metadata. The SWORD protocol should be flexible enough for us to use other standards like DDI. At http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html#protocoloperations_creatingresource_entry the SWORDv2 spec says (emphasis added):
We interpret this to mean that in addition to Dublin Core (dcterms, specifically), the SWORD spec is flexible enough to support wildly different metadata formats such as DataCite (https://www.datacite.org ), DDI (Data Documentation Initiative: http://www.ddialliance.org ), VO (Virtual Observatory: http://www.ivoa.net/documents/latest/RM.html ), ISA-Tab (Investigation, Study, and Assay in XML format: http://isatab.sourceforge.net/docs/Wiemann_SupplFile4.xml ), etc.
We're not sure if any other SWORD server implementation is going beyond dcterms, however, which is what the spec requires. We'll ask on the mailing list.