Boinc Wrapper unzips inputfiles again after resume from checkpoint? #5933
Hi,

I was testing the project-specific corrections for checkpoint/resume of GPUGrid ATMML tasks and noticed a delay of several minutes before the main task resumed processing. After monitoring the cause carefully, I noticed that the following steps were repeated on every resume:

I also checked the wrapper.cpp source code and noticed that do_unzip_inputs() is called unconditionally, just before the check for whether we are resuming from a checkpoint.

Question: is this behaviour intentional, and if so, what is the reasoning behind it?

From my perspective, since inputs are copied/unzipped on the first run, and the slot directory isn't cleaned (afaik) on pause/resume of a task, redoing those steps is not only inefficient but also potentially harmful: it might alter the task's files to a point where a resume fails.

Regards,
Replies: 2 comments 3 replies
@davidpanderson, could you please take a look at this?
The client copies input files to the slot directory every time the job starts, not just the first time.
It does the copy only if the file is not present in the slot dir.
The wrapper unzips input files, then it deletes the .zip file.
As a result, if the job is stopped and restarted, the .zip file gets copied and unzipped again.
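The cycle described above can be reproduced with a small model. This is only a sketch in Python, not the actual client or wrapper code: the function names, `input.zip`, and `data.txt` are hypothetical, and the "task" is reduced to a single write to an extracted file. It assumes the client copies a file only when it is missing from the slot dir, and the wrapper unzips and then deletes the archive.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def client_start_job(project_dir: Path, slot_dir: Path, name: str) -> bool:
    """Model of the client: copy an input file only if it is absent from the slot dir."""
    dest = slot_dir / name
    if dest.exists():
        return False
    shutil.copy(project_dir / name, dest)
    return True


def wrapper_unzip_inputs(slot_dir: Path, name: str) -> bool:
    """Model of the wrapper: unzip the archive if present, then delete it."""
    archive = slot_dir / name
    if not archive.exists():
        return False
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(slot_dir)  # overwrites files that were extracted earlier
    archive.unlink()
    return True


def simulate_restarts(starts: int) -> list[tuple[bool, bool, str]]:
    """Run several start/stop cycles; record (copied, unzipped, data.txt content)."""
    with tempfile.TemporaryDirectory() as tmp:
        project_dir = Path(tmp) / "project"
        slot_dir = Path(tmp) / "slot"
        project_dir.mkdir()
        slot_dir.mkdir()
        with zipfile.ZipFile(project_dir / "input.zip", "w") as zf:
            zf.writestr("data.txt", "original input")

        history = []
        for _ in range(starts):
            copied = client_start_job(project_dir, slot_dir, "input.zip")
            unzipped = wrapper_unzip_inputs(slot_dir, "input.zip")
            content = (slot_dir / "data.txt").read_text()
            history.append((copied, unzipped, content))
            # Stand-in for the task changing an extracted file while it runs.
            (slot_dir / "data.txt").write_text("modified by task")
        return history
```

In this model, `simulate_restarts(3)` reports a copy and an unzip on every start, and `data.txt` is back to its original content each time. That matches both the delay and the overwrite risk raised in the question: because the archive is deleted after extraction, the "not present" check always succeeds on the next start.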
Possible solutions:
I'd lean toward 1 + 3.
That way things will work correctly with both old client + new wrapper and new client + old wrapper.
Does that sound OK?