additional values ( for several codelists) #5

Open
pebau opened this issue Mar 27, 2023 · 14 comments
Labels
codelist request request for codelist update

Comments

@pebau

pebau commented Mar 27, 2023

Having tried the request form as an exercise, I'd kindly ask to add these options for rasdaman datacubes:

  • Main category: Analytics
  • Objective: Analytics
  • Framework: rasdaman
  • Architecture: datacubes
  • Algorithm: WCPS
  • Input data: no idea what to add here for the datacubes
  • Output data: likewise
  • Conditions for access and use: not so easy to say because each dataset comes with its individual additional constraints
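
For illustration only, the requested values could be captured as a record like the one below. The key names are assumptions and not the catalog's actual schema; the fields the request leaves open are set to `None`.

```python
# Hypothetical key names; the values are those requested in the list above.
rasdaman_request = {
    "main_category": "Analytics",
    "objective": "Analytics",
    "framework": "rasdaman",
    "architecture": "datacubes",
    "algorithm": "WCPS",
    "input_data": None,   # left open in the request
    "output_data": None,  # left open in the request
    "conditions_for_access_and_use": None,  # varies per dataset, so not set here
}

# Fields that still need a decision before the codelists can be updated.
open_fields = [key for key, value in rasdaman_request.items() if value is None]
```
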
@sMorrone
Contributor

Hi @pebau, I am moving this post to the resource-metadata repository for consistency.

@sMorrone sMorrone transferred this issue from FAIRiCUBE/catalog Mar 28, 2023
@cozzolinoac11
Member

cozzolinoac11 commented Mar 28, 2023

Hi @pebau, I am adding values to the codelists and wondering whether it is better to use "Analytics" or "Analysis".
I would prefer "Analysis".
What do you think?

@pebau
Author

pebau commented Mar 28, 2023

@cozzolinoac11 thanks for asking - actually, I have a preference for "Analytics", which is nowadays a common umbrella term for all ways of getting insight out of data; see also this discussion.

@cozzolinoac11
Member

Hi @pebau, I agree with you! I will update the codelists shortly.

@cozzolinoac11
Member

Hi @pebau,
regarding your observation on "Conditions for access and use": it is a good point when we refer to the input/output data. In this form, however, the license being asked for is that of the analysis/processing resource, not that of the input/output data.

@pebau
Author

pebau commented Apr 3, 2023

@cozzolinoac11
apologies, not sure I understand:

  • input data might be what we ingested; they often come with conditions which we also have to respect when serving the data ourselves (e.g., Copernicus)
  • but what is output data? what the users extract/derive? in that case I thought the conditions would apply
  • "analysis resource" seems to be the data that has been ingested before? or the CPU?

@cozzolinoac11
Member

@pebau
I fully agree with you regarding the licenses of the data (so yes, no doubt we have to respect the conditions they come with). In my post, I was just pointing out that this form (and this repository) is meant to collect information about the analysis/processing (a/p) resources. An a/p resource (e.g. an algorithm that extracts statistics from a set of data) has its own license, which is independent of the license of the input/output data.

@pebau
Author

pebau commented Apr 4, 2023

@cozzolinoac11
got it - thanks for clarifying!

@sMorrone sMorrone added the codelist request request for codelist update label May 2, 2023
@sMorrone sMorrone changed the title from "additional metadata values" to "additional values (several codelists)" May 2, 2023
@sMorrone sMorrone changed the title from "additional values (several codelists)" to "additional values ( for several codelists)" May 3, 2023
@KathiSchleidt
Member

On Input Data vs. Output Data, my understanding was as follows:

  • Input Data: data the a/p resource was trained on
  • Output Data: not quite sure, assuming a dataset generated by this a/p resource

@cozzolinoac11
Member

Input and output data depend on the type of resource:

  • for ML/DL resources:
    - Input Data: data the a/p resource was trained on.
    - Output Data: the model generated as a result of training on the input data, e.g., the link to an HDF5 file. The field ‘Characteristics of output data’, in turn, is useful for understanding how to use the model, its parameters, etc.
  • for other resource types:
    - Input Data: data used by the a/p resource.
    - Output Data: the result of the algorithm applied to the input data; this may be, for example, the value of an analysis or a new dataset obtained from a pre-processing or data-cleaning step.
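
The distinction above can be sketched as a small lookup, purely for illustration. The enum, the dictionary, and the `describe` helper are hypothetical names, not part of the request form or the catalog; the descriptions paraphrase the list above.

```python
from enum import Enum


class ResourceKind(Enum):
    """Two kinds of a/p resource distinguished in the comment above."""
    ML_DL = "ML/DL"
    OTHER = "other"


# Meaning of the two form fields per resource kind (hypothetical structure).
FIELD_SEMANTICS = {
    ResourceKind.ML_DL: {
        "input_data": "data the a/p resource was trained on",
        "output_data": "model generated by training on the input data",
    },
    ResourceKind.OTHER: {
        "input_data": "data used by the a/p resource",
        "output_data": "result of the algorithm applied to the input data",
    },
}


def describe(kind: ResourceKind, field: str) -> str:
    """Return what a form field means for the given kind of resource."""
    return FIELD_SEMANTICS[kind][field]
```

The same field name resolves to a different meaning depending on the resource kind, which is the property under discussion in this thread.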

@KathiSchleidt
Member

Having different semantics behind the same fields worries me (which is why I've long been questioning the Input and Output data concepts).

Also, following your logic, where would the data used by the a/p resource be provided for ML/DL models? Where would analysis results of ML/DL models be stored?

Also - where are potential updates to the a/p resource metadata being collected, is there a draft of the next version of D4.3 available somewhere?

@pebau
Author

pebau commented May 4, 2023

can we maybe have an example for every situation in the catalog?

@cozzolinoac11
Member

@pebau I just filled in the form with the two example pre-processing resources we included in D4.3.

An example of a deep learning resource is available at #6

@pebau
Author

pebau commented May 9, 2023

@cozzolinoac11 thank you, that was an interesting read. So, as I understand it, the result of a catalog query would be a Jupyter notebook demonstrating use of the processing element.

If that is correct, it is easy for us to likewise provide notebooks for rasdaman use, illustrated with examples from the FC datacube set.
