Performance Improvements using Worker Threads #15813
@DonJayamanne we should turn this into a meta issue and create work items for each particular case we move into a worker. This is great if we can do this wherever possible.
FYI - We have managed to resolve the performance issue in Jupyter notebooks by delaying the loading of the Python extension. Now Jupyter notebooks open in around 3 seconds; the Python extension gets loaded/activated AFTER the notebook is opened.
Looking briefly at how the locators work, I believe you could make this much faster if you performed this and other similar functions in parallel: vscode-python/src/client/pythonEnvironments/base/locators/common/resourceBasedLocator.ts, lines 46 to 63 in 203f58b
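A minimal sketch of the gain being suggested here (the function and locator names are hypothetical, not the extension's actual code): awaiting independent locator queries one after another costs the sum of their latencies, while `Promise.all` costs roughly the maximum.

```typescript
// Stand-in for a locator query (e.g. probing conda, pyenv, venv).
// The delays are illustrative, not measurements from the extension.
async function probeLocator(name: string, ms: number): Promise<string> {
    return new Promise((resolve) => setTimeout(() => resolve(name), ms));
}

// Sequential: total time ≈ 300 + 200 + 100 = 600 ms.
async function discoverSequentially(): Promise<string[]> {
    const results: string[] = [];
    for (const [name, ms] of [['conda', 300], ['pyenv', 200], ['venv', 100]] as const) {
        results.push(await probeLocator(name, ms));
    }
    return results;
}

// Parallel: the waits overlap, so total time ≈ max(300, 200, 100) = 300 ms.
async function discoverInParallel(): Promise<string[]> {
    return Promise.all([
        probeLocator('conda', 300),
        probeLocator('pyenv', 200),
        probeLocator('venv', 100),
    ]);
}
```

Note that this only overlaps the async waits; any synchronous work inside each probe still runs on the main thread, which is why the worker-thread suggestion below in the issue body is complementary rather than redundant.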
I may just not be understanding things, but from what I hear, environment discovery can take 20 seconds, which seems like an outrageous amount of time to me. Have we done a perf trace to see where time is actually being spent during environment activation?
Processes should block the main thread for milliseconds; spawning a single conda process certainly should not be blocking the extension host for seconds.
Yes, that's correct.
I had a chat with @karrtikr yesterday and it looks like there are some pretty nice gains to be made by making locators run more in parallel.
Closing in favor of #22146 (comment).
I've been looking into a performance issue related to opening notebooks.
TL;DR - Opening a notebook on Windows for Chris Dias and me takes 20-30 seconds (the Jupyter extension is not doing anything).
If we disable the Python extension, it opens almost immediately (~5 seconds).
I think the delay is in the Python extension spawning n processes. I know we're making changes to environment discovery.
However, I think we can improve this even further (or, even with the current changes, it might not be sufficient).
I believe the problem lies with the spawning of processes: it takes at least a few milliseconds just to spawn a process, and I know that on Windows spawning a conda process can take a few seconds. The spawn itself is blocking (spawning a process in Node.js is synchronous), even though communication with the child process is async.
This blocks the main thread, and hence blocks all other extensions (which kinda explains why it takes so many seconds just to open a notebook when Jupyter isn't doing anything).
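The blocking described above can be made visible by timing the `spawn()` call itself: the call does not return until the OS has created the child process, so that interval is time the event loop cannot run, even though all subsequent I/O with the child is asynchronous. A small sketch (the command timed here is `node --version` for self-containment, not conda):

```typescript
import { spawn } from 'child_process';

// Time the synchronous portion of spawn(): everything between calling
// spawn() and it returning happens on the main thread and blocks the
// event loop. On Windows with a slow-to-start binary (e.g. conda) this
// can be seconds rather than milliseconds.
function timedSpawn(cmd: string, args: string[]) {
    const start = process.hrtime.bigint();
    const child = spawn(cmd, args);
    const blockedMs = Number(process.hrtime.bigint() - start) / 1e6;
    return { child, blockedMs };
}

const { child, blockedMs } = timedSpawn('node', ['--version']);
console.log(`spawn() blocked the thread for ${blockedMs.toFixed(2)} ms`);
child.on('close', () => { /* child output handling is async from here on */ });
```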
Suggestion - Why not spawn the processes in a worker thread (completely non-blocking)?
I did look into this last year, and we can do this today.
What we can move to these worker threads:
Spawning processes just to get their environment information
Spawning processes to run code, e.g. python -c "import ipykernel"
& other simple processes
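The suggestion above can be sketched with Node's `worker_threads` module (a hypothetical helper, not the extension's implementation; the worker source is inlined via `eval: true` to keep the example self-contained):

```typescript
import { Worker } from 'worker_threads';

// Worker body (plain JS, evaluated as CommonJS): the blocking spawn
// happens here, on the worker's thread, so the main/extension-host
// thread stays free to service other extensions.
const workerSource = `
    const { execFileSync } = require('child_process');
    const { parentPort, workerData } = require('worker_threads');
    const output = execFileSync(workerData.cmd, workerData.args, { encoding: 'utf8' });
    parentPort.postMessage(output);
`;

// Run a command in a worker thread and resolve with its stdout.
function runInWorker(cmd: string, args: string[]): Promise<string> {
    return new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, {
            eval: true,
            workerData: { cmd, args },
        });
        worker.once('message', resolve);
        worker.once('error', reject);
    });
}

// Example usage, e.g. querying an interpreter without blocking the main thread:
// runInWorker('python', ['-c', 'import sys; print(sys.version)']).then(console.log);
```

One caveat worth noting: worker threads have their own startup cost, so this pays off most for a long-lived worker servicing many spawn requests, rather than one worker per process.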