New SHARE and WORK variables pointing to the symlink'd location? #6488

Open
ColemanTom opened this issue Nov 22, 2024 · 5 comments
Labels
question Flag this as a question for the next Cylc project meeting.

@ColemanTom
Contributor

Problem

In global.cylc, our share and work directories are on Lustre and are symlinked into our workflows area. Typically, we access those locations through variables such as CYLC_TASK_WORK_DIR and ROSE_DATAC, but these go through the symbolic link. Under low traffic that is fine, but we have shown that under high traffic it can cause problems, and using the real path puts less load on the metadata servers. Someone explained it to me a while ago, but I've forgotten what happens each time a symbolic link is traversed.

Proposed Solution

When a workflow is installed, Cylc knows the real path because it sets the symlinks up. I would like those real paths made available as variables directly in the job script so we can avoid traversing the symbolic links when scripting. I think the symlinks are useful for navigating easily in interactive mode, but from an efficiency perspective it is better not to use them inside a script.

@oliver-sanders
Member

oliver-sanders commented Nov 25, 2024

Suggest using realpath to fix these variables in the workflow definition like so:

[runtime]
    [[root]]
        [[[environment]]]
            CYLC_TASK_WORK_DIR = $(realpath "$CYLC_TASK_WORK_DIR")

This will ensure that all uses of the env var will pick up the resolved location if that is what you want.
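As a rough illustration of what realpath does here, the following sketch builds a throwaway directory layout (the lustre/work and cylc-run/flow paths are made up for the example, not taken from any real site) and shows that the symlinked path resolves to its target:

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical layout: a "real" work directory and a symlink pointing at it.
tmp=$(mktemp -d)
# Normalise the temp dir itself, since it may already sit behind a symlink.
tmp=$(realpath "$tmp")
mkdir -p "$tmp/lustre/work" "$tmp/cylc-run/flow"
ln -s "$tmp/lustre/work" "$tmp/cylc-run/flow/work"

linked="$tmp/cylc-run/flow/work"
# realpath follows the symlink and prints the canonical target path.
resolved=$(realpath "$linked")

echo "via symlink: $linked"
echo "resolved:    $resolved"

# Clean up the throwaway directories.
rm -rf "$tmp"
```

Note that realpath has to inspect the filesystem to resolve the link, so each call costs some I/O of its own.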

@oliver-sanders oliver-sanders added the question Flag this as a question for the next Cylc project meeting. label Nov 25, 2024
@ColemanTom
Contributor Author

Suggest using realpath to fix these variables in the workflow definition like so:

[runtime]
    [[root]]
        [[[environment]]]
            CYLC_TASK_WORK_DIR = $(realpath "$CYLC_TASK_WORK_DIR")

This will ensure that all uses of the env var will pick up the resolved location if that is what you want.

I know that is doable, but it still adds some I/O load to the system via realpath. It is very simple to do on the Cylc side without adding any extra load to the remote platforms, e.g. this MR. Looking at that old one, I'm suggesting what I did there, without the directory creation changes.

@hjoliver
Member

Seems reasonable to me. E.g. CYLC_TASK_WORKDIR_REAL. @oliver-sanders - are you aware of any downside to doing this?
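A minimal sketch of how a job script might consume such a variable, assuming it were added. CYLC_TASK_WORKDIR_REAL is only the name floated above and is not an existing Cylc variable; the helper pick_work_dir and the example paths are likewise hypothetical. Falling back to the standard variable keeps the script usable on sites that do not provide the resolved variant:

```shell
#!/usr/bin/env bash

# Hypothetical: CYLC_TASK_WORKDIR_REAL is only the name suggested in this
# thread; Cylc does not currently set it.
pick_work_dir() {
    # Prefer the pre-resolved path when provided, else use the standard variable.
    printf '%s\n' "${CYLC_TASK_WORKDIR_REAL:-${CYLC_TASK_WORK_DIR:-}}"
}

# Example values for illustration only.
CYLC_TASK_WORK_DIR="/home/user/cylc-run/flow/work/1/task"
unset CYLC_TASK_WORKDIR_REAL
fallback=$(pick_work_dir)

CYLC_TASK_WORKDIR_REAL="/lustre/work/flow/1/task"
preferred=$(pick_work_dir)

echo "without the resolved variable: $fallback"
echo "with the resolved variable:    $preferred"
```

No filesystem access happens in the job here: the resolved path would have to be supplied from the Cylc side, which is the point of the proposal.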

@oliver-sanders
Member

oliver-sanders commented Dec 3, 2024

The downsides are:

  • Site-specific: it is not clear whether this will be used beyond BoM.
  • More environment variables. We already have a lot; adding realpath variants of these will bloat the environment and make things more confusing for users (which variant should I use, and why?).
  • Scripts may still use the non-realpath variants of the env vars (hard to enforce, difficult to manage with site portability), whereas overwriting the env var always works, so it is the technically superior solution.

If there is no downside to resolving this symlink (requires thought), then we could consider making this the default.

Alternatively, if your site is keen to avoid symlink resolving, then perhaps we could consider a per-platform config to flatten the existing env vars?

@dpmatthews
Contributor

I don't think there is any way to safely avoid using realpath, in which case I think Oliver's original suggestion makes sense rather than adding more variables.
