Error when using SlurmExecutor on RHEL8 compute nodes #42

Open
svandenhaute opened this issue Nov 15, 2022 · 1 comment

svandenhaute commented Nov 15, 2022

Environment

  • Covalent version: 0.177.0
  • Covalent-Slurm plugin version: 0.7.0
  • Python version: 3.10
  • Operating system: PopOS 22.04 LTS

What is happening?

asyncssh seems to have trouble sending the command that creates a directory on the compute node. I don't know exactly what's going on, but based on this article I'd conclude that some HPC systems do not cope well with a login shell that has no terminal attached, because of a legacy mesg n command in /etc/profile.

How can we reproduce the issue?

import covalent as ct
import numpy as np


@ct.electron(executor='local')
def sum_(n):
    return np.sum(np.arange(n))

@ct.electron(executor='local')
def product_(n):
    return np.prod(np.arange(n)[1:])


def get_sum_product(n):
    return sum_(n) + product_(n)


if __name__ == '__main__':
    workflow = ct.lattice(get_sum_product, executor='slurm')
    dispatch_id = ct.dispatch(workflow)(10)

What should happen?

[2022-11-15 13:22:54,295] [ERROR] execution.py: Line 364 in _run_task: Exception occurred when running task 4: mesg: ttyname failed: Inappropriate ioctl for device
[2022-11-15 13:22:54,297] [ERROR] execution.py: Line 372 in _run_task: Run task exception
Traceback (most recent call last):
  File "/home/sandervandenhaute/envs/covalent_env/pyenv/lib/python3.10/site-packages/covalent_dispatcher/_core/execution.py", line 345, in _run_task
    output, stdout, stderr = await execute_callable()
  File "/home/sandervandenhaute/envs/covalent_env/pyenv/lib/python3.10/site-packages/covalent/executor/base.py", line 572, in execute
    result = await self.run(function, args, kwargs, task_metadata)
  File "/home/sandervandenhaute/envs/covalent_env/pyenv/lib/python3.10/site-packages/covalent_slurm_plugin/slurm.py", line 399, in run
    raise RuntimeError(client_err)
RuntimeError: mesg: ttyname failed: Inappropriate ioctl for device

Any suggestions?

Adding request_pty='force' to the conn.run() call seems to fix the issue, although the message is still displayed in the log. Replacing mesg n with tty -s && mesg n, as suggested elsewhere, requires root access, which is not always available.
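
To make the suggestion concrete, here is a minimal sketch of the workaround. It is not the plugin's actual code: the function name run_with_forced_pty is hypothetical, and conn is assumed to be an already-open asyncssh.SSHClientConnection (or anything with a compatible run() coroutine). The traceback suggests the plugin raises whenever the remote stderr is non-empty; the sketch checks the exit status instead, which is my own assumption about a reasonable alternative. With a pseudo-terminal allocated, remote stderr normally arrives merged into stdout, which would explain why the mesg message still shows up in the log.

```python
async def run_with_forced_pty(conn, command):
    """Run `command` over an open SSH connection with a forced pseudo-terminal.

    `conn` is assumed to behave like asyncssh.SSHClientConnection: its run()
    coroutine accepts `request_pty` and returns an object exposing
    `exit_status`, `stdout` and `stderr`.
    """
    # request_pty="force" asks the server for a PTY even though no local
    # terminal is attached, so `mesg n` in /etc/profile finds a tty.
    result = await conn.run(command, request_pty="force")

    # Assumption: treat a non-zero exit status as failure, rather than any
    # output on stderr, so harmless profile warnings do not abort the task.
    if result.exit_status != 0:
        raise RuntimeError(result.stdout or result.stderr)
    return result.stdout
```

With a real connection, this would be awaited as run_with_forced_pty(conn, "mkdir -p <remote workdir>") inside an asyncssh.connect(...) context, where the directory path is whatever the executor is configured to use.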

@wjcunningham7
Member

Hi @svandenhaute, thanks so much for this feedback and suggestion. We'll look into this and see if we can reproduce the issue.

CC: @AlejandroEsquivel
