Allow blocks with partially resolved parentage information #94
Comments
@todor-ivanov , thanks for creating the issue. But I think neither solution is acceptable. For solution 1, I do not know how it will impact other use-cases. For solution 2, I do not know how to do that correctly in the Oracle DB, what the default would be, and why we should have it in the file parent table. I think you are looking for solution 3:
How does it sound to you? @amaltaro please provide your input too. |
Thanks @vkuznet, @amaltaro, what is your opinion on this? |
@todor-ivanov , please provide additional information about which DBS API is affected by this request, and provide an example of the HTTP POST input payload. For the DBS API, please inspect the corresponding Python client to identify which DBS API it is using; here is the DBS API documentation, which lists all DBS APIs. |
Valentin, Todor, this discussion is in agreement with what we discussed yesterday. Just to summarize:
For the record, the only expected user of this API is WMCore, specifically that ReqMgr2 CherryPy thread. |
I can work on this if @klannon approves my time allocation for this task, or we can delegate this task to @d-ylee . Meanwhile, here is a list of action items which should be done within the DBS codebase:
|
@d-ylee Hi Dennis, did you manage to check the feasibility of the developments listed above? Please let us know if further clarification is required here. |
@amaltaro I think the discussion above is clear to me. I can work on this after I get the updating physics group name completed. Thanks! |
@vkuznet After looking through the code, I noticed that for the database schema: dbs2go/static/schema/DDL/create-oracle-schema.sql Lines 866 to 870 in 928dc25
|
Dennis, it is a good question, and in fact it should be answered by @amaltaro , @todor-ivanov . For instance:
So, the two approaches will reflect how we'll query the data and how we will interpret the results of such query. For example, if we stick to approach 1, then our I suggest that @amaltaro , and @todor-ivanov should evaluate both pros and cons and side effects of this decision. |
From the DBS perspective, would there be any impact and/or performance degradation if we allow When using the which returns the following data
do I understand it right that, in case 1) above, we would get an output data as (just an example):
while in case 2) the output response would be an empty list. Can someone please confirm this behaviour? |
Having or not parent file id in |
Hi @vkuznet @d-ylee @amaltaro, To me option |
@vkuznet @amaltaro @todor-ivanov If we go with
For the From a past message:
Would we need to have an entry in the FILES table that is for no parent files, but when getting from Also, would there be cases in which allowing null parentage for |
Dennis, everything can be much simpler than you describe. How about using -1 for PARENT_FILE_ID when it is not provided? This way we do not need to adjust the schema: -1 is an acceptable integer, and we may use it in other tables and APIs to indicate that no parent is present. |
The negative value for missing parent file id is a valid workaround, which would prevent the need of table redefinition. But we need to be sure what is about to be returned to the client as well. Because, once recorded as Dennis, can you list the APIs in DBS which would develop such a silent data type metamorphosis if we opt for this workaround, so that we can check where and when WMCore uses them? FYI @amaltaro |
Not sure if this is what you are asking for, but the APIs that are related to parentage are these:
|
Coming back to this, it does not seem that all those APIs you list in your previous message would be affected by
So to me, it seems only the APIs which return information on the So in the most straightforward use case, where this might affect us is in the client's method calls to So this is the place I think we will have to rethink this mechanism again, but I see no other way to test it other than just trying it in testbed and checking what the server returns and how we aggregate the parentage information. To me this again would be yet another workaround, which we will have to remember forever. |
Just an update of this issue as an outcome of the Zoom discussion we just had (Valentin, Dennis and myself), this is the list of action items that we came up with:
|
@d-ylee Hi Dennis, I wonder if you managed to look into the |
After some basic testing,
In this example, the value of This behavior is reflected in the documentation: https://pkg.go.dev/encoding/json#Unmarshal
Null values in JSON are given the I can do additional testing by creating a simple HTTP server and sending a Python request with a |
@d-ylee thank you for these details! I think it is good enough and I understand that WMCore can keep providing a tuple of Please let us know how your communication with the CERN DBA goes and where you need support on anything on this front. |
@d-ylee thanks for confirmation. PLEASE keep in mind that
So, on the wire, i.e. when the HTTP request is sent, the Python code MUST provide @amaltaro , @todor-ivanov you must ensure that whenever Python code provides |
@vkuznet yes, the python object is properly encoded as a json object. So no problems on this front are expected. |
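The encoding point above can be checked with the standard library alone. This is a minimal sketch, assuming the payload is a list of [child, parent] id pairs; the ids and the bare-list layout are illustrative and not taken from the DBS client.

```python
import json

# Hypothetical child/parent id pairs; None marks a parent that could not be resolved.
pairs = [[1001, 2001], [1002, None]]

# json.dumps serializes Python None as the JSON literal null.
payload = json.dumps(pairs)
print(payload)

# Decoding the payload restores None on the Python side.
decoded = json.loads(payload)
```

Whatever the server does with the value afterwards, what travels over HTTP is the literal `null`, not a Python-specific token.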
I did a request from Python in the same way as what @vkuznet detailed using |
@d-ylee Hi Dennis, it's been 2 weeks since we had our last chat, and I believe we only managed to cover points 1 and 2 from this comment so far. Do you have any updates on the other points? Do you now have a DBS dump such that we can try things in a dev environment? In addition to that, can you please share what the current constraints are for updating block/file parentage data? It might be that Valentin already provided it in one of the tickets where we discussed this, but I could not find it. It's important to mention that this issue is currently keeping data unnecessarily unlocked in production, so we had better fix it sooner rather than later. |
hi @d-ylee would you want us to organize another meeting if you need some more information from our side so we can proceed on this? If you think there is some step that we first need to do and that is currently blocking the progress, please let us know. |
I spoke to @yuyiguo about this issue, and she explained to me some of the policies from the past and about this partial parentage issue. Before DBS3, partial blocks were allowed, and these could be inserted into DBS when there were missing blocks. This was because we may not have had all files, or we did not have the parent-child relationships yet. This was an issue because we might forget about these partial blocks and end up not knowing who is the parent. Yuyi mentioned that the policy in DBS3 was to only allow full blocks. Workflow management was to have its own database to keep unfinished blocks there, then insert into the production database. If we want to change this policy, we need to have it written down and record who is responsible for completing the block. Another concern is when a child has multiple parents. The current schema of the BLOCK_PARENTS table has the columns PARENT_BLOCK_ID and THIS_BLOCK_ID (the child). These two columns have a unique constraint. This means that there can't be more than one Another question I have is, do you have the location of the written DBS3 design policies? What are your thoughts? |
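For reference, a composite unique constraint over two columns rejects a repeated (child, parent) pair, while the same child may still appear in several rows with different parents. The sketch below uses an in-memory SQLite table; only the two column names come from the comment above, and it is an illustration rather than the Oracle schema used by DBS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE block_parents ("
    " this_block_id INTEGER NOT NULL,"
    " parent_block_id INTEGER NOT NULL,"
    " UNIQUE (this_block_id, parent_block_id))"
)
conn.execute("INSERT INTO block_parents VALUES (1, 10)")
# Same child, different parent: accepted by the composite constraint.
conn.execute("INSERT INTO block_parents VALUES (1, 11)")

try:
    # Exact (child, parent) pair repeated: violates the constraint.
    conn.execute("INSERT INTO block_parents VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute("SELECT COUNT(*) FROM block_parents").fetchone()[0]
```

One consequence for a sentinel approach: if a single placeholder value stands for "parent unknown", a given child can carry at most one such placeholder row under this kind of constraint.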
@amaltaro , good design practice suggests separate APIs for different use-cases rather than introducing complicated if/else logic. The latter was used in the Python server code, which (if you had a chance to look at it) is very hard to read and understand. Therefore, I expressed my opinion based on this argument. Plus, you can never be sure that a single client will remain the only case. I can easily foresee other clients: a plain script (instead of Python code), or CRAB, or any new development we may have in the future. And even with a single client, as it is right now, you really address different use-cases. For maintenance reasons it is better to address them via separate APIs rather than complicating the server code (even if we need additional code refactoring and have to wait a little longer). |
to me this is exactly the opposite of what you are suggesting. Creating another API just to deal with a slightly different use case IS an over-complication, it is confusing and it causes mistakes on the client side. The extra query string would be the best approach in my opinion, but that will require changes in:
The approach I am suggesting adds NO overcomplication and does not make the code more complicated, does not increase maintenance and is transparent to the end user. It also does not require the release of a new dbs3-client!
to something like
Anyways, I think we will spend the rest of our days disagreeing on this! |
Alan, I would like to take a last chance to convince you and the team not to add changes to the existing API. My last argument is the following:
In my suggestion this is not possible by design, since the client will be forced either to use a new API or to provide an external parameter to the API in question. This way the client will a priori do the right thing, and the DBS server will not rely on if/else logic over the input data. You may provide as many arguments as you want that such a server implementation is easier to do, but you open the possibility that one day a client will (by whatever mistake) provide incorrect input and the DBS server will follow the if/else logic to insert the data into the database. For instance, someone may by mistake not provide parents, or we may have a bug elsewhere in WMCore which skips some parent, etc. In either case the DBS server will follow the input if we use the same API signature. To avoid this possibility we must force the client to be explicit with the call (in other words, the client will know what it is doing by calling the proper API, or the API with the proper parameter, rather than only providing an input to the DBS server). I hope that you can take these arguments more seriously and properly evaluate the implications of the implementation for DBS, but I will not argue any more about how to implement this logic and leave it up to you and the rest of the WMCore team to decide (here I have a dual role: I belong to the WMCore team, and in my view the implementation should have a separate API or API plus external parameter, and I implemented the DBS logic). I think Dennis should implement whatever WMCore agrees on. That said, I tag @todor-ivanov to evaluate my concern, and you both should come to a conclusion on how the implementation should be done. Once you reach such an agreement, I think it will be clearer for Dennis how to implement it. In the end, it is just implementation, but my concern is that the choice may impact data consistency in the DBS database in the long run. |
hi @amaltaro @vkuznet @d-ylee, One really important thing to take into account when making such a decision is whether our current policy, of not inserting files into DBS which lack parentage information, is acknowledged and well understood by every other group that may be affected. And here I mean both sides:
We should not talk only about WMCore and the people responsible for the DBS server. And this, I believe, is information we do not have at the moment. Obviously there is no policy document for DBS server behavior, hence our blindness on the applicability of this policy. The opposite view could easily be widespread among other groups who rely on the data returned by DBS, meaning they may rely on the parent-child relationship in DBS being satisfied in every case. So we are actually in a state of no consensus on the policy itself. Now about the possible solutions: the fix only on the server side is less work and is an encapsulated change, but because of the above unknowns, I'd bet on the harder one, the one that forces the client to state explicitly and consciously which policy it chooses for uploading the files. It is not only backwards compatibility that we are talking about here, but also bearing the responsibility for doing one or the other. In the end we will not change the API, regardless of which solution we choose here - !!! we only change Server behavior !!!
Yes this is going to affect/require changes from everybody, as @amaltaro said:
In the case of WMCore this should be a fairly small change - adding the new option here and here. The tests and validation procedure, on the other hand, would be as long as usual and need to be a full one; well, it has its price. One more thing: since this situation (as far as WMCore is concerned) is an expected one, and our use cases are well identified in advance, as early as building the lists for the parentage resolution itself, I would not further complicate the code by implementing extra calls to DBS for checking DBS errors and retries. This would only hurt code readability and scalability without any benefit. So I'd go directly for just changing the policy on the WMCore side by using the newly provided option to the same API. |
I just got back from my vacation and saw this discussion. Right before @d-ylee and I left for vacation, @amaltaro, Dennis, and I had a Zoom meeting on this. We discussed why "-1" would not work with the DBS DB schema. Alan pointed out that the missing parent files would be lost forever; in other words, the file parentage would not be updated in the future. So we concluded that it is better to just not put anything in DBS for the files that lost their parent files.
|
@amaltaro @todor-ivanov If this is what we would like to do, I think we would need to incorporate the number of missing files to the check here: Line 352 in 928dc25
I do see that the Another question I have is whether DBS would need to keep track of these missing files, or whether this would be on WMCore's side. Both this question and the above validation seem to affect each other. Thoughts? |
@d-ylee Hi Dennis, Yuyi's idea looks good to me, and having the client provide how many files are missing in the JSON data is a good approach (defaulting it to 0, if not provided). As for keeping track of the missing files, I would say that the DBS server should at least log that a block with incomplete parentage was inserted (a sort of access log). On the WM side, we do log that a given file failed to find its parent file. Please let us know if there are any remaining open questions. |
@d-ylee Dennis, in addition to what we have discussed today, I just wanted to update this thread with a way to migrate a dataset from prod/global to your dev cluster (such that we can test things out):
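One way to read the proposal above is that the server-side check would balance the files in the block against the children that did receive a parent plus the number of files the client declares as missing. The following is only a sketch of that idea in Python, not the dbs2go validation code; the function name and the `missing_files` parameter are hypothetical.

```python
def parentage_is_consistent(block_file_ids, child_parent_ids, missing_files=0):
    """Return True when every file in the block is either paired with a parent
    or accounted for by the client-declared number of missing files."""
    block_files = set(block_file_ids)
    children_with_parent = {child for child, _parent in child_parent_ids}
    # Every child named in the payload must belong to the block.
    if not children_with_parent <= block_files:
        return False
    return len(children_with_parent) + missing_files == len(block_files)
```

With `missing_files` defaulting to 0, a payload that silently drops a child fails the check, which keeps the current strict behaviour for clients that do not opt in.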
I looked into the dbs3-client code and came up with the following
but all my attempts failed with:
and I cannot figure out why. Perhaps @vkuznet has a working example with curl other than this: |
Alan, could you please read the documentation closely. It says:
while in your request your URL does not specify |
Moreover, each DBS server has its own set of APIs which you can easily find via
Similar |
@vkuznet Valentin, you probably missed a few important points:
The |
Alan, I understand what you are doing. The provided dbs2go documentation does not list our production/test/dev URLs and refers to the actual service. Of course I would expect that you know your server endpoint, e.g.
Then curl pod is created in default namespace and you can login to it and use curl to access your services. |
Okay, it looks like Dennis has some documentation work to be done at some point, such that a developer can get as close to the production environment as possible, to have more meaningful tests. I didn't remember we had a manifest only for curl, that will definitely be handy. Thanks. @d-ylee Dennis, as we discussed yesterday, I finally got many JSON dumps for a StepChain workflow that you could play with. Note that they only provide dataset parent information, no file and/or block parentage. You can find them here: Their parentage relationship is:
Please let me know if this enables you to move forward with this development and testing. |
@amaltaro Thanks for the json dump. @vkuznet I was going through the fileparent insert API and noticed that we are validating the incoming json twice: Line 85 in e71c8a4
Line 368 in e71c8a4
Was there a reason for this? To implement the Lines 209 to 222 in e71c8a4
|
Why do you think we are validating twice? Those are two different APIs, the former one is
record. The corresponding JSON will look like this:
While the latter is from
record. The corresponding JSON record will look like this:
Both records come in different DBS requests (you may need to figure out which ones). Therefore both require their own validations. Regarding validation procedure. The |
…d issue with parentage check Kept validation check Partial Parentage is indicated by providing -1 to ChildParentIDList Only insert valid parentage
@amaltaro and I decided to use -1 in order to keep the |
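The behaviour summarized in the commit note above (a -1 parent id marks partial parentage, and only valid parentage is inserted) could look roughly like this. It is a Python sketch rather than the actual dbs2go change; `split_parentage` and the sample ids are hypothetical.

```python
MISSING_PARENT = -1  # sentinel used in this thread for an unresolved parent


def split_parentage(child_parent_ids):
    """Separate valid (child, parent) pairs from children whose parent is missing."""
    valid, orphans = [], []
    for child, parent in child_parent_ids:
        if parent == MISSING_PARENT:
            orphans.append(child)
        else:
            valid.append((child, parent))
    return valid, orphans


valid, orphans = split_parentage([[1001, 2001], [1002, -1], [1003, 2002]])
```

Only the `valid` pairs would go to the parentage table; the `orphans` list is what a server could log as files uploaded without a resolved parent.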
@vkuznet I made a PR and would like to deploy it to the CERN image registry. My plan is to use a conditional on the tag (adding |
@d-ylee Dennis, provided that you can authenticate, yes, you should be able to, because I have done that for some WMCore images. |
@d-ylee I second Alan's reply. Yes, you can upload a new image, and yes it would be useful to have |
@vkuznet I am having difficulty making the image based off the commands in the GitHub Instead of merging, should I pull it into this repository as a new branch, then tag it to |
@d-ylee why do you need CI/CD to build your custom image? Instead, do the following:
|
@amaltaro has tested the changes on test3, and we have agreed to move it to preprod for further testing. |
I have pushed the changes to dbs2go-global-w and dbs2go-phys03-w on testbed. |
Awesome, thanks Dennis! I hope to be able to upgrade testbed ReqMgr2 later today, otherwise it will happen tomorrow afternoon. BTW, should we close this ticket with #101 ? |
@d-ylee kind ping (please see above) |
Closing due to changes deployed to production. |
Impact of the new feature
There could be many groups affected by such a change.
Is your feature request related to a problem?
This is a request for a feature change which is related to a recently found custom use case involving miscommunication between WMCore and DBS. Here is the related WMCore issue and description of the situation: dmwm/WMCore#11260 (comment)
Long story short, we need to allow uploading blocks with partially resolved parentage information. This means a few of the files present in a block may have no relation to files from the parent dataset (missing or partially resolved parentage information).
The main question here is ultimately for the Core SW and Physics groups as well: how mandatory is this constraint, and how badly would such a change affect other steps?
Describe the solution you'd like
#1: We simply remove the check for fully resolved parentage information for all files in a given block, here: dbs/fileparents.go#L344-L358
#2: We allow files from the child block to refer to a NO parent file by setting the parent file id to a predefined default value (in accordance with the DBS data structures), which should be an indication that these files are missing the upper level files which they have been produced from.
Describe alternatives you've considered
The alternative approach would be for WMCore to start developing a mechanism for automatically invalidating all those files from DBS and Rucio, which would run silently in the background. So far we have no numbers to cite about how frequent this is, and that's why we cannot tell whether, and to what extent, it could lead to data reduction. But in any case, we are not in favor of this alternative path.
Additional context
None