VSCode Quarto extension interactive window can't start a process-based dask LocalCluster/Client #624

Open
dbdean opened this issue Dec 19, 2024 · 0 comments

I need to run the dask Client/LocalCluster with processes rather than threads, because memory management seems much better that way and long-running jobs can finish without running out of memory. However, if I create the following qmd file:

```{python}
from dask.distributed import Client
```

Creating a cluster with processes=False:

```{python}
client = Client(processes=False)
display(client)
```

Creating a cluster with processes=True:

```{python}
client = Client(processes=True)
display(client)
```

Then when I step through the cells with [Shift]+[Enter], the first example (using threads) runs ok in the interactive window, but the second example (using processes) prints out a massive set of repeating warnings and errors before eventually failing with RuntimeError: Nanny failed to start.

At the very top of these errors it says (many many many times):

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.12/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/multiprocessing/spawn.py", line 131, in _main
    prepare(preparation_data)
  File "/usr/local/lib/python3.12/multiprocessing/spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/usr/local/lib/python3.12/multiprocessing/spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 286, in run_path
  File "<frozen runpy>", line 259, in _get_code_from_file
  File "/workspaces/dask_with_qmd_example/minimal-cluster-example.qmd", line 1
    ```{python}
    ^
SyntaxError: invalid syntax
...

This suggests that it is trying to read the qmd file in as a Python file and failing (understandably).
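The last frames of the traceback can be reproduced without dask. This is a minimal sketch using only the standard library: it writes a stand-in for the qmd file to a temporary directory (the file content and the helper name `run_qmd_as_python` are made up for illustration) and passes it to `runpy.run_path`, the same call that `multiprocessing/spawn.py` makes in the traceback.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical stand-in for the qmd document; the opening fence on line 1
# is what the Python parser trips over.
FENCE = "`" * 3
QMD_SOURCE = f"{FENCE}{{python}}\nfrom dask.distributed import Client\n{FENCE}\n"


def run_qmd_as_python(source: str = QMD_SOURCE) -> str:
    """Feed Quarto markup to runpy.run_path and report what happens."""
    with tempfile.TemporaryDirectory() as tmp:
        qmd_path = Path(tmp) / "minimal-cluster-example.qmd"
        qmd_path.write_text(source, encoding="utf-8")
        try:
            # run_path compiles the file as Python source regardless of extension.
            runpy.run_path(str(qmd_path))
        except SyntaxError as exc:
            return f"SyntaxError at line {exc.lineno}: {exc.msg}"
    return "ran without error"


print(run_qmd_as_python())
```

If this reports a SyntaxError on line 1, that would be consistent with the spawned worker process being handed the path of the qmd file as its main script.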

It works fine when rendering the document using quarto render minimal-cluster-example.qmd --to html

It also works fine with similar code in a bare python file at the command line, and when executing that file line by line in a VSCode REPL or Jupyter Interactive Window. You do need to put if __name__ == "__main__": around the script when running at the command line though.

It also works fine with a similar set of code in a Jupyter Notebook running interactively inside of VSCode.

If you break up the code into creating a LocalCluster and then linking the Client, it is linking the client that causes the problem, not creating the cluster. It looks a lot like something is trying to import the local python module, and failing because it is getting a quarto file instead of a python file (which is similar to the reason you need the if __name__ == "__main__": block in the bare python, I think).
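One way to check this theory from inside the interactive window is to look at what the session reports as its main module. The sketch below assumes, based on the `init_main_from_path` entry in the traceback and my reading of CPython's `multiprocessing/spawn.py`, that the spawn start method takes that path from `sys.modules["__main__"].__file__`. The helper name `describe_main_module` and the dictionary keys are hypothetical, chosen only for this example.

```python
import sys
from pathlib import Path


def describe_main_module() -> dict:
    """Summarise the __main__ attributes a spawned child is likely to receive."""
    main = sys.modules["__main__"]
    main_file = getattr(main, "__file__", None)
    spec = getattr(main, "__spec__", None)
    return {
        # Path that would be re-run in the child, if any (assumption, see above).
        "main_file": main_file,
        # Module name used instead of a path when __main__ has a spec.
        "main_spec_name": getattr(spec, "name", None),
        # True when there is no file at all, or the file is Python source.
        "main_file_is_python": main_file is None
        or Path(main_file).suffix in {".py", ".pyw"},
    }


print(describe_main_module())
```

If `main_file` points at the qmd document when run in the Quarto interactive window, but is absent or a .py file in the Jupyter notebook and plain Python cases, that would support the explanation above.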

Not being able to run this interactively is making it a lot harder to build up to the final analysis that I can run in one go with quarto render.

Thanks for your help and/or any assistance or workarounds that might be possible.

Reproduction Information

VSCode version 1.96.1 running on Windows 10 Enterprise. Connected to a dev container with the following configuration (using Podman 5.3.1):

{
	"name": "Python 3",

	"image": "mcr.microsoft.com/devcontainers/python:1-3.12-bullseye",
	"features": {
		"ghcr.io/rocker-org/devcontainer-features/quarto-cli:1": {}
	},

	"postCreateCommand": "pip3 install --user -r requirements.txt",

	// https://medium.com/@guillem.riera/making-visual-studio-code-devcontainer-work-properly-on-rootless-podman-8d9ddc368b30
	"runArgs": [
		"--userns=keep-id:uid=1000,gid=1000", 
	   ],
	"containerUser": "vscode",
	"updateRemoteUserUID": true,

	"customizations": {
		"vscode": {
			"extensions": [
				"ms-toolsai.jupyter"
			]
		}
	},
}

requirements.txt:

dask[distributed]
nbformat
nbclient
jupyter