File metadata is not exported #27

Open
dgarijo opened this issue Aug 3, 2019 · 5 comments

Comments

@dgarijo
Contributor

dgarijo commented Aug 3, 2019

Is your feature request related to a problem? Please describe.
Metadata fields attached to data used in executions are not exported in the provenance export. This means that any custom metadata added to a file, such as the data catalog dataset id, will not be exported.

Describe the solution you'd like
I would like the custom metadata to be exported.

@mosoriob
Contributor

mosoriob commented Aug 9, 2019

A DataFile has zero or more metadata fields. For example, this query lists the metadata fields of sarah.txt:

select ?metadata ?value {
  <http://localhost:8080/wings_portal/export/users/admin/CaesarCypher/data/library.owl#sarah.txt> ?metadata ?value .
  ?metadata <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#DatatypeProperty> ;
            <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://www.wings-workflows.org/ontology/data.owl#hasDataMetrics> .
}
------------------------------------------------------------------------------------------------------------------
| metadata                                                                                           | value     |
==================================================================================================================
| <http://localhost:8080/wings_portal/export/users/admin/CaesarCypher/data/ontology.owl#hasSize>     | 13        |
| <http://localhost:8080/wings_portal/export/users/admin/CaesarCypher/data/ontology.owl#hasLanguage> | "english" |
------------------------------------------------------------------------------------------------------------------

Thus, I propose:

<https://www.opmw.org/export/resource/WorkflowExecutionArtifact/_id_>
         <https://www.opmw.org/ontology/hasMetadata> <metadata_uri> .

<metadata_uri>  <https://www.opmw.org/ontology/hasValue>         "15.0"^^xsd:decimal .
<metadata_uri>  <https://www.opmw.org/ontology/hasKey>           "hasSize"^^xsd:string .

The metadata URI will be, for example:

https://www.opmw.org/page/export/resource/Metadata/CaesarCypherMapR-a7839f81-cb3d-49ea-8490-4ef75d18e97d_v1_hasSize

It is built like this:

String metadataVariableURI = PREFIX_EXPORT_RESOURCE + Constants.CONCEPT_METADATA_VARIABLE + "/"
        + concreteTemplateExport.getTransformedTemplateIndividual().getLocalName() + "_" + variable.getLocalName();
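
The URI construction above can be sketched as a standalone helper. This is only an illustration: the two constant values are assumptions inferred from the example URI, and the class and method names are hypothetical, not part of the exporter.

```java
public class MetadataUriSketch {
    // Assumed values, inferred from the example URI in this comment.
    // The real exporter defines PREFIX_EXPORT_RESOURCE and
    // Constants.CONCEPT_METADATA_VARIABLE itself.
    static final String PREFIX_EXPORT_RESOURCE = "https://www.opmw.org/page/export/resource/";
    static final String CONCEPT_METADATA_VARIABLE = "Metadata";

    // Hypothetical helper: joins the template local name and the
    // metadata local name with "_" under the Metadata concept.
    static String metadataUri(String templateLocalName, String localName) {
        return PREFIX_EXPORT_RESOURCE + CONCEPT_METADATA_VARIABLE + "/"
                + templateLocalName + "_" + localName;
    }

    public static void main(String[] args) {
        System.out.println(metadataUri(
                "CaesarCypherMapR-a7839f81-cb3d-49ea-8490-4ef75d18e97d_v1", "hasSize"));
    }
}
```

Under these assumptions, running `main` prints the example URI shown above.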

A bad idea:

The provenance can be:

<https://www.opmw.org/export/resource/WorkflowExecutionArtifact/_id_> 
         <https://w3id.org/wings/export/CaesarCypher/DatatypeProperty#hasSize> 
          "15.0"^^xsd:decimal .

However, the problem with this approach is that the URI of the hasSize property contains the domain name (CaesarCypher here). So, writing a query to obtain a metadata field is hard unless you already know the domain.

@dgarijo
Contributor Author

dgarijo commented Aug 9, 2019

@sirspock Why is
<https://www.opmw.org/export/resource/WorkflowExecutionArtifact/_id_> <https://w3id.org/wings/export/CaesarCypher/DatatypeProperty#hasSize> "15.0"^^xsd:decimal .
A bad idea? I think it's an excellent idea. In fact, I would do
<https://www.opmw.org/export/resource/WorkflowExecutionArtifact/_id_> <https://w3id.org/wings/export/_domainName_/extension#_originalPropertyName> "original_property_value"^^type.

@dgarijo
Contributor Author

dgarijo commented Aug 9, 2019

And then you just have to iterate over all the dcdom metadata properties.
The reason I think it's not a good idea to add the property to opmw is that these properties can differ between domains (users can create them), so I suggest using the dynamic NS we set up.
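
A minimal sketch of this suggestion: iterate over a file's metadata properties and emit one triple per property under the domain-specific namespace. The class, the methods, and the `Map` input are hypothetical stand-ins for the dcdom metadata. The namespace pattern follows the one proposed above, and values are written as plain string literals for simplicity rather than with their original datatype.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DomainMetadataExportSketch {
    // Base of the dynamic namespace, taken from the pattern in this thread.
    static final String DYNAMIC_NS = "https://w3id.org/wings/export/";

    // Builds <DYNAMIC_NS><domainName>/extension#<propertyName>.
    static String predicateUri(String domainName, String propertyName) {
        return DYNAMIC_NS + domainName + "/extension#" + propertyName;
    }

    // Emits one N-Triples-style line per metadata property of one file.
    // Note: literal values are not escaped in this sketch.
    static List<String> toTriples(String artifactUri, String domainName,
                                  Map<String, String> metadata) {
        List<String> triples = new ArrayList<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            triples.add("<" + artifactUri + "> <"
                    + predicateUri(domainName, entry.getKey())
                    + "> \"" + entry.getValue() + "\" .");
        }
        return triples;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("hasSize", "13");
        metadata.put("hasLanguage", "english");
        String artifact =
                "https://www.opmw.org/export/resource/WorkflowExecutionArtifact/_id_";
        toTriples(artifact, "CaesarCypher", metadata).forEach(System.out::println);
    }
}
```

Because the domain name is part of the predicate, two domains that both define `hasSize` produce distinct properties.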

@mosoriob
Contributor

mosoriob commented Aug 9, 2019

> @sirspock Why is
> <https://www.opmw.org/export/resource/WorkflowExecutionArtifact/_id_> <https://w3id.org/wings/export/CaesarCypher/DatatypeProperty#hasSize> "15.0"^^xsd:decimal .
> A bad idea? I think it's an excellent idea. In fact, I would do
> <https://www.opmw.org/export/resource/WorkflowExecutionArtifact/_id_> <https://w3id.org/wings/export/_domainName_/extension#_originalPropertyName> "original_property_value"^^type.

I see two issues:

  1. How can I get all the metadata of an execution?
  2. To get a metadata property (e.g., the data catalog id), I need to know the domain. :/

@dgarijo
Contributor Author

dgarijo commented Aug 9, 2019

  1. You don't need all metadata from an execution. You have to do it by file.
  2. Yes, that is inconvenient. But you see, different domains may have the same metadata field meaning different things. What you may have used as a geographical location someone else may have used as a path location. So you need the domain, right?

Since we need a solution for the data catalog id, let's have a single property called "dataCatalogId" in the opmw namespace. But that's the only one.
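
The resulting rule could look roughly like the sketch below: every metadata property goes to the domain-specific namespace, except the data catalog id, which gets one fixed opmw property. The class and method names are hypothetical. The opmw namespace is assumed from the URIs used earlier in this thread, and the sketch also assumes that the source property is itself named `dataCatalogId`.

```java
public class PredicateChoiceSketch {
    // Assumed from the opmw property URIs used earlier in this thread.
    static final String OPMW_NS = "https://www.opmw.org/ontology/";
    // Base of the dynamic, per-domain namespace.
    static final String DYNAMIC_NS = "https://w3id.org/wings/export/";
    // Assumption: the source metadata property carries this exact name.
    static final String DATA_CATALOG_ID = "dataCatalogId";

    // Returns the predicate URI to use for one metadata property.
    static String predicateFor(String domainName, String propertyName) {
        if (DATA_CATALOG_ID.equals(propertyName)) {
            // Domain-independent, so it can be queried without knowing the domain.
            return OPMW_NS + DATA_CATALOG_ID;
        }
        return DYNAMIC_NS + domainName + "/extension#" + propertyName;
    }

    public static void main(String[] args) {
        System.out.println(predicateFor("CaesarCypher", "dataCatalogId"));
        System.out.println(predicateFor("CaesarCypher", "hasSize"));
    }
}
```

With this split, a query for the data catalog id needs no domain name, while all other metadata stays scoped to its domain.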
