💡 Discussion on vocabulary for data #45
Comments
Hi Micha, while I accept that selectivity is a property that every catalyst has and that can be uniquely attributed to a catalyst, using selectivity in the same sense as weight is quite difficult. For weight, we can ignore the time of measurement, since it hardly matters whether a person weighs around 100 kg today and weighed 99.8 kg yesterday; they still weigh around 100 kg. We cannot do so with selectivity. Selectivity is a non-unary characteristic: it always depends on the given educts and the product to which it refers, so it is "selectivity (of a product for a given educt)". The measured selectivity of a deNOx catalyst such as a car catalyst also depends on the fuel we use, not only on whether we measure NO2 or N2O2. And even then, the selectivity still depends on the whole set of reaction conditions (e.g. cold-start selectivity vs. set-point selectivity). If you are familiar with I-ADOPT, this picture, where I tried to model the data content of https://nfdirepo.fokus.fraunhofer.de/dataset.xhtml?persistentId=doi:10.82207/FK2/NR5BWO via I-ADOPT, might help you understand the complexity of modeling a selectivity measurement in a simple data set. Maybe it is best not to focus on a highly composite term like selectivity (it is basically not one but multiple parameters combined to be more easily comparable) and instead focus on more straightforward terms like reaction enthalpy, or even decompose a term like selectivity into more manageable terms like conversion and yield. But I would also like to hear your suggestions on how you want to handle the complexity of selectivity, or examples in which you want to use the term selectivity without struggling with such things. All the best |
Hey Hendrik, thanks for your answer :). I think we need to boil it down to the point where it is like the mass of a person. The data point is the output of a measurement, and therefore the context (time, reactants, temperature profile, etc.) is given. If we have a term like Let's be concrete: since you asked for an example, I will use highlighted notation to indicate triple stores: This vocabulary could be used quite nicely to annotate data in nomad. I still find it quite helpful to separate qualities like mass from the data, because then you can think about what mass is as a quality, and at another point you can think about where the data comes from, e.g. a measurement. Obviously one could use one term for both data and quality, but then I always wonder: are you referring to selectivity as a property of a reaction, or to the numerical value of a specific measurement? What do you think? Best, Micha |
Hi Hendrik and all, The second point of discussion, how we define selectivity, is separate from the original post, I think. Maybe we should open a separate post about this? I don't think it is an option to leave selectivity undefined in a catalysis vocabulary. The currently existing definition of selectivity is not general enough and does not capture what I need to annotate the data in my datasets: "In photocatalytic processes where product formation is expected, selectivity describes the ability of a photocatalyst to produce only the desired product with minimal (or none) of byproducts." The definition I suggest, "A property of a product that refers to the ratio of products obtained from given reactants.", is more data focused. The definition could be extended by the formula "S(product) = amount of product / sum over all products" and should mention the ambiguity that products are sometimes assigned special weight factors depending on the number of reactant molecules that go into each product. |
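A minimal sketch of the formula suggested above, in case it helps the discussion. The function name, the product labels and the way weight factors are passed in are assumptions made for illustration; they are not part of voc4cat or of any existing definition.

```python
from typing import Mapping, Optional


def selectivity(
    amounts: Mapping[str, float],
    product: str,
    weights: Optional[Mapping[str, float]] = None,
) -> float:
    """Return S(product) = w_p * n_p / sum_i(w_i * n_i).

    `amounts` maps each product to its measured amount (same unit for all).
    `weights` optionally assigns a weight factor per product (hypothetical
    convention: products without an entry get a weight of 1.0).
    """
    if weights is None:
        weights = {}
    weighted = {name: weights.get(name, 1.0) * n for name, n in amounts.items()}
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no products formed; selectivity is undefined")
    return weighted[product] / total


# Unweighted case: 3 parts of product_a, 1 part of product_b.
print(selectivity({"product_a": 3.0, "product_b": 1.0}, "product_a"))
```

The optional `weights` argument makes the ambiguity explicit: the same amounts give a different selectivity once weight factors are applied, so a data set should state which convention was used.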
I agree to separate the discussion on the definition for selectivity (to be continued in #46) from how to model data (to be discussed here). @schumannj - Do you want to copy/move your 2nd point to the new issue? |
Yes, thanks David, I copied my last paragraph into the new issue #46. |
Hi @schumannj and @RoteKekse and all others, sorry if I misunderstood the With regards to @schumannj, I think we should define all those measurement data classes, as the class "conversion", for example, might just be used in a thesaurus sense, while the class "conversion measurement datum" is the only one allowed to inherit individuals which link to the data of the measurement. The question for me is more where the pure application level begins and where our direct ontology ends. I recommend reading this design pattern and the corresponding example; maybe that will help. As you can see in the links, the 4Chem data would just be modeled. Regarding the difficulties in finding a definition for selectivity, I agree. |
Even after the split there may still be two discussion topics here:
Regarding the latter:
On "datum" in the previous messages: You seem to be talking about something different from a location reference, where "a datum is a reference point, surface, or axis on an object against which measurements are made" (https://en.wikipedia.org/wiki/Datum_reference). What do you mean by datum? |
In OBI and IAO, to which @RoteKekse and @schumannj, or at least I, am referring, a datum is simply a single data item.
|
Regarding the question of whether we need to split the discussion further: I think splitting is not necessary, as I think the two questions cannot be answered independently of each other. If we want to provide an OBI/IAO-esque approach to providing data, we are "stuck" with the general terms and ideas provided by both ontologies. If we do not want to align your vocabulary (at least not too much) with those ontologies, we might be able to answer both questions separately. |
Hey Hendrik, The same is true for the efficiency of a solar cell. It is calculated from a measurement, but it is a quality of the solar cell. I would be happy if we could include these distinctions in voc4cat :). I find the initiative very helpful and was happy to learn about it last week. I think this can be a big step towards interoperability in catalysis! Julia and I could then try annotating data in nomad to test the vocabulary in an ELN setting. Best, Micha |
I agree that selectivity is most often an output of a data transformation. However, you can get devices that output selectivity data directly and do the transformation internally. So selectivity is not fundamentally different from other measures. Temperature is also the result of a transformation, e.g. from voltage to Kelvin. The difference is that in one case you trust the reading as the result of an "external" transformation, and in the other case you include the transformation in the data model. Both cases are relevant. |
Hey David, you are correct; for us it is important to know where to draw a line for now (we can extend it if needed). We do not need to describe reality in its fullest (which is not possible anyhow :D), and we need to compromise somewhere. For catalysis research I think it is fine to treat temperature as a measurement (it might not be if you are a control engineer); for selectivity that might be different. But if not, we can just treat selectivity as the measurement datum for now and refine later if needed. This process is never finished, and we always have the option to refine. |
Yes, be pragmatic first. ...but also be aware of the simplifications you make. I am looking forward to hearing how well the nomad ELN integration goes. |
@RoteKekse will you closely follow OBO-style modelling? I wonder because you were referring to their ontologies mainly. @HendrikBorgelt is leaning towards I-ADOPT and DCAT. I-ADOPT should interface well with voc4cat as terminology. It would be great to have an exchange about how well both approaches work. Maybe at or soon after Katalytiker-Tagung? |
Regarding the modelling of selectivity, which is a ratio of two or more measurements, you may find the example for modelling body-mass index (BMI) in OBI interesting. BMI is similar in the way that it is also derived from two measurements. |
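To illustrate the idea of a value derived from two measurements, here is a small sketch. It is not the OBI pattern itself; the `Datum` class and its `derived_from` field are hypothetical names, and the numbers are made up. It only shows a derived datum keeping links to the measurement data it was computed from.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Datum:
    """A single data item with a value, a unit and optional inputs."""
    label: str
    value: float
    unit: str
    derived_from: Tuple["Datum", ...] = ()


def body_mass_index(mass: Datum, height: Datum) -> Datum:
    """Derive BMI = mass / height^2 and keep references to both inputs."""
    if mass.unit != "kg" or height.unit != "m":
        raise ValueError("expected mass in kg and height in m")
    return Datum(
        label="body mass index",
        value=mass.value / height.value ** 2,
        unit="kg/m^2",
        derived_from=(mass, height),
    )


mass = Datum("mass measurement datum", 80.0, "kg")
height = Datum("height measurement datum", 2.0, "m")
bmi = body_mass_index(mass, height)
print(bmi.value, bmi.unit, [d.label for d in bmi.derived_from])
```

A selectivity datum could be handled the same way, with the product amounts as inputs, so that the transformation stays visible in the data model.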
Hey David, I read the book on BFO, and OBI is based on BFO. I felt that everything I need already exists there. And with OBI and CHMO there are already lots of terms which are relevant to us. Yes, the example of BMI is exactly how I think about these things. But I have never explored different approaches. |
@dalito, regarding I-ADOPT and DCAT, I am currently just trying to get something in between a data format which is unstructured but uses a defined vocabulary such as "Voc4Cat" and "an" ontology (in catalysis I think we mostly have to deal with a plethora of ontologies combined into the "local" ontology), which I think is too difficult for most scientists and even data stewards to handle. So my focus is more on how we can use a "vocabulary" such as DCAT to give scientists as well as data stewards an easy model after which they can structure their data (something like a conceptual model, which might not be perfectly depicted in this picture https://www.researchgate.net/publication/234793155/figure/fig1/AS:595782646378496@1519057060120/More-Expressive-Semantic-Models-Enable-More-Complex-Applications.png ). |
@RoteKekse I think the OBI/IAO approach is one of the most flexible and also expressive modeling descriptions one can have, it is just quite large and therefore "unwieldy" compared to an approach such as |
Description
Dear NFDI4Cat,
I am working on catalysis at HZB and I am using Nomad as software/ELN. We use nomad to describe catalytic data.
I met David Linke last Wednesday and I would like to discuss whether it is worth adding terms describing the data.
In an ontology there is usually a distinction between a specific dependent quality, for example selectivity (of a product), and a data point, for example 50%. Selectivity is a property a catalyst always has when probed in a specific catalytic experiment under specific conditions. Data usually also has a unit or is dimensionless. The data point(s) could be the output of a measurement, or the result of a simulation or a calculation. And there can be multiple of those, e.g. the experiment could be conducted under the same conditions multiple times on different days.
For selectivity one could argue that any catalyst has a selectivity for any possible product (although it might quite often be 0). The fact that it has a selectivity does not depend on any data; it is just something it has.
Data then reveals how this quality could be quantified in a specific situation.
This distinction is also present in ontology work such as OBI.
I find this distinction quite helpful since it separates the data from the quality. A more practical example would be the mass of a human. Any human has a mass. The fact that they have a mass is fixed and is a quality of every human. That mass can then be measured and data is produced; it could be 80 kg, 100000 g or something else.
For me as a data steward this makes life a lot easier, and I have already applied this in the context of solar cells.
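A rough sketch of this separation, assuming a very simple data model. The class names `Quality` and `MeasurementDatum` and the unit table are invented for illustration and do not correspond to OBI, voc4cat or nomad terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quality:
    """Something the bearer has, independent of any data about it."""
    name: str
    bearer: str


@dataclass(frozen=True)
class MeasurementDatum:
    """One value with a unit that quantifies a quality in one situation."""
    quality: Quality
    value: float
    unit: str


# Hypothetical conversion table: grams per unit.
GRAMS_PER_UNIT = {"kg": 1000.0, "g": 1.0}


def in_grams(datum: MeasurementDatum) -> float:
    """Express a mass datum in grams so that data can be compared."""
    return datum.value * GRAMS_PER_UNIT[datum.unit]


mass = Quality(name="mass", bearer="person A")
first = MeasurementDatum(mass, 80.0, "kg")
second = MeasurementDatum(mass, 100000.0, "g")

# Two different data points, one and the same quality.
print(first.quality == second.quality, in_grams(first), in_grams(second))
```

The quality exists once, while any number of data points with their own values and units can refer to it.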
What do you think :)?
@schumannj, @dalito @HendrikBorgelt @AleSteB
All the best
Micha