Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

$CYLC_WORKFLOW_LOG_DIR points to log/scheduler instead of just log #6494

Open
MetRonnie opened this issue Nov 25, 2024 · 9 comments
Open

$CYLC_WORKFLOW_LOG_DIR points to log/scheduler instead of just log #6494

MetRonnie opened this issue Nov 25, 2024 · 9 comments
Labels
question Flag this as a question for the next Cylc project meeting. small
Milestone

Comments

@MetRonnie
Copy link
Member

MetRonnie commented Nov 25, 2024

A quirk is that this points to

export CYLC_WORKFLOW_LOG_DIR="${CYLC_WORKFLOW_RUN_DIR}/log/scheduler"

rather than just ${$CYLC_WORKFLOW_RUN_DIR}/log.

From examination of the blame, this is due to it changing from

  1. CYLC_SUITE_LOG_DIR="${CYLC_SUITE_RUN_DIR}/log/suite" (Cylc 7)
  2. CYLC_WORKFLOW_LOG_DIR="${CYLC_WORKFLOW_RUN_DIR}/log/workflow" (Cylc 8 beta)
  3. CYLC_WORKFLOW_LOG_DIR="${CYLC_WORKFLOW_RUN_DIR}/log/scheduler" (Cylc 8)

Also, this env var is undocumented: cylc-doc issue opened for it: cylc/cylc-doc#779

Question: are we going to do anything about this? Edit: main proposal is to rename it to CYLC_SCHEDULER_LOG_DIR and make CYLC_WORKFLOW_LOG_DIR deprecated with a warning, and add/update a cylc lint rule for this.

@MetRonnie MetRonnie added small question Flag this as a question for the next Cylc project meeting. labels Nov 25, 2024
@MetRonnie MetRonnie added this to the 8.x milestone Nov 25, 2024
@hjoliver
Copy link
Member

Historically, I suspect this reflects the distinction between "workflow log" (i.e. scheduler log) vs "job logs". But I can't think why jobs should need to know the scheduler log location.

@MetRonnie
Copy link
Member Author

Changing it to point at the main log directory could break some workflows unfortunately (though it wouldn't hurt to do a grep on site to get some numbers). So maybe we could just rename it to CYLC_SCHEDULER_LOG_DIR and make CYLC_WORKFLOW_LOG_DIR deprecated with a warning, add a cylc lint rule etc.

@oliver-sanders
Copy link
Member

oliver-sanders commented Nov 26, 2024

I don't see this as a problem/quirk the "workflow log" environment variable points at the directory where we store the "workflow's log files", seems legit right? The distinction between job logs and workflow logs seems reasonable.

If you would like an env var pointing at the root log dir, that could be done, but what for?

@MetRonnie
Copy link
Member Author

MetRonnie commented Nov 26, 2024

I think most users would expect CYLC_WORKFLOW_LOG_DIR to be CYLC_WORKFLOW_ID/log, which would be consistent with CYLC_WORKFLOW_SHARE_DIR and CYLC_WORKFLOW_WORK_DIR. This tripped up both Kerry and me yesterday

Now, we don't necessarily need one that points to the root log dir (although I guess you could argue the same for work or share), but I think the one that points to the scheduler log dir should be named as such.

@oliver-sanders
Copy link
Member

oliver-sanders commented Nov 26, 2024

We can spin that both ways, I think most users would expect to find the workflow logs in the workflow log dir?

Either way, probs not concerning enough to change?

@hjoliver
Copy link
Member

hjoliver commented Nov 26, 2024

We could resolve the ambiguity like this:

  • CYLC_WORKLFOW_LOG_DIR = $CYLC_WORKFLOW_ID/log
  • CYLC_SCHEDULER_LOG_DIR = $CYLC_WORKFLOW_ID/log/scheduler
    • and/or CYLC_WORKFLOW_SCHEDULER_LOG_DIR = ditto
  • CYLC_WORKFLOW_JOB_LOG_DIR = $CYLC_WORKFLOW_ID/log/job

Rationale: the "workflow" has a scheduler log and job logs.

Or (simpler) only:

  • CYLC_WORKLFOW_LOG_DIR = $CYLC_WORKFLOW_ID/log

Rationale: as above, but let users add "job" or "scheduler" as needed.

I'd be in favour of doing this. If we leave small ambiguities and inconsistencies as-is they eventually accumulate and make a bad smell.

Of course we have to trade off with the damage caused by breaking changes. But in this case I highly doubt anyone is using WORKFLOW_LOG_DIR as it currently is, in job environments. ... especially as, as @MetRonnie noted above, it's not even documented.

@hjoliver
Copy link
Member

(Note the variable appears 23 times in our tests.)

@MetRonnie
Copy link
Member Author

Got a grep on site running for a while, still no results, may leave it overnight

@MetRonnie
Copy link
Member Author

MetRonnie commented Nov 28, 2024

I've found 1 case:

# app/housekeeping/rose-app.conf
[env]
LOG_DIR=${CYLC_WORKFLOW_LOG_DIR}

And guess what:

# app/housekeeping/bin/housekeeping_logs.sh
LOG_DIR=$(dirname ${LOG_DIR})

So they're accessing the root log dir anyway 😆

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Flag this as a question for the next Cylc project meeting. small
Projects
None yet
Development

No branches or pull requests

3 participants