Anaconda only Python Installations

Luke R edited this page Mar 13, 2020 · 6 revisions

Handling Python installed by Anaconda

The Problem

Jupyter kernels that are loaded via Python, like the IPython kernel, are launched via the Python executable. We have found that if Anaconda is installed without adding entries to your PATH environment variable (which is the recommended setup), StatTag can launch the IPython kernel, but the kernel will not fully load.

Remember that a Jupyter kernel is launched as an executable, using the argv values in the kernel's kernel.json file. The launch itself can work on an Anaconda-only install because kernel.json may contain the fully qualified path to Python. For example:

{
 "argv": [
  "C:\\Users\\Win7\\Anaconda3\\python.exe",
  "-m",
  "ipykernel_launcher",
  "-f",
  "{connection_file}"
 ],
 "display_name": "Python 3",
 "language": "python"
}

So even though python.exe isn't in our PATH, we can run it because the full path is specified. However, a Jupyter kernel has other dependencies, which in turn require other Anaconda paths. We can see this with the ZeroMQ library. If we run Python using the full path and then try to import the ZMQ library, we get an error:

Python 3.7.4 (default, Aug  9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32

Warning:
This Python interpreter is in a conda environment, but the environment has
not been activated.  Libraries may fail to load.  To activate this environment
please see https://conda.io/activation

Type "help", "copyright", "credits" or "license" for more information.
>>> import zmq;
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Win7\Anaconda3\lib\site-packages\zmq\__init__.py", line 47, in <module>
    from zmq import backend
  File "C:\Users\Win7\Anaconda3\lib\site-packages\zmq\backend\__init__.py", line 40, in <module>
    reraise(*exc_info)
  File "C:\Users\Win7\Anaconda3\lib\site-packages\zmq\utils\sixcerpt.py", line 34, in reraise
    raise value
  File "C:\Users\Win7\Anaconda3\lib\site-packages\zmq\backend\__init__.py", line 27, in <module>
    _ns = select_backend(first)
  File "C:\Users\Win7\Anaconda3\lib\site-packages\zmq\backend\select.py", line 28, in select_backend
    mod = __import__(name, fromlist=public_api)
  File "C:\Users\Win7\Anaconda3\lib\site-packages\zmq\backend\cython\__init__.py", line 6, in <module>
    from . import (constants, error, message, context,
ImportError: DLL load failed: The specified module could not be found.
>>>

The import fails with a DLL load error because pieces of what ZMQ needs live in Anaconda directories that aren't in the PATH. The Python interpreter at least gives us a nice warning about this, telling us that the environment wasn't activated.
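To make the PATH problem concrete, here is a minimal sketch that reports which Anaconda directories are absent from a PATH value. The function name and the list of subdirectories are assumptions for illustration; conda's own activation scripts are the authority on which directories get added.

```python
import ntpath  # Windows path rules, usable on any platform

# Assumed for illustration: a few directories that activation places on PATH.
# "" stands for the Anaconda root itself.
ASSUMED_SUBDIRS = ("", r"Library\bin", "Scripts")


def missing_path_entries(anaconda_root, path_value, subdirs=ASSUMED_SUBDIRS):
    """Return the expected Anaconda directories that are not in path_value."""

    def norm(p):
        # Windows paths are case-insensitive and may carry a trailing slash.
        return ntpath.normcase(ntpath.normpath(p))

    present = {norm(p) for p in path_value.split(";") if p.strip()}
    expected = [
        ntpath.join(anaconda_root, sub) if sub else anaconda_root
        for sub in subdirs
    ]
    return [d for d in expected if norm(d) not in present]
```

An empty result means every assumed directory is already on the PATH; a non-empty result matches the situation described above, where Python starts but its DLL dependencies can't be located.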

What are our options?

There are a few different ways that we could handle this, depending on what we want to take on, and what we want to impose on the user.

  1. Look at the python path. If it includes “Anaconda”, we can infer it’s an Anaconda installation and format the commands we need. This assumes Anaconda is set up in some specific ways, which may not be the case on everyone’s machine. It basically means the user doesn’t have to do anything though.
  2. Ask the user to tell us that they are using an Anaconda-only Python setup (like through the User Settings dialog). This explicitly tells us that the user is in this situation, and they can tell us the base Anaconda path. It’s more explicit for StatTag, but requires extra work for the user.
  3. Provide instruction to the user that they need to modify their system variables so that the Python path and everything else is in the PATH. This will be more cumbersome to the user, but means no changes would be needed to StatTag.
  4. Let users know that they have to install Python separately so that we can access it. Similarly, this requires no changes to StatTag, but gives the user more work and may mess up their Anaconda environment.
  5. Defer support for Anaconda-only environments. We’re not sure how many potential users are in this situation, so we may end up shutting out a large share of them.

There are pros and cons to the different options we've outlined. We will revisit our final approach if we identify problems (and document it here so we know why it didn't work).

So what are we going to do?

We want to minimize the burden on our users, so our approach is to detect whether the user has an Anaconda setup. Although we can't rely on this information being in the PATH variable, we can turn to the Windows registry.

First, we will look for entries under the following keys (in the order we will check):

HKEY_CURRENT_USER\Software\Python\<Company>\<Tag>
HKEY_LOCAL_MACHINE\Software\Python\<Company>\<Tag>

As noted in PEP 514, <Company> is PythonCore for an official release. Today we can see that ContinuumAnalytics is the company name for our Anaconda installation, but instead of coding to that we will enumerate the non-PythonCore company names to see if any underlying <Tag> entries have Anaconda as part of the tag name. If so, we assume that an Anaconda installation is present.
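The enumeration described above can be sketched in Python with the standard winreg module. This is an illustration, not StatTag's own code: the function names are hypothetical, the case-insensitive match is an assumption, and the registry reader only runs on Windows.

```python
def find_anaconda_tag(companies):
    """Return (company, tag) for the first non-PythonCore tag naming Anaconda.

    `companies` maps each <Company> name to a list of its <Tag> names.
    """
    for company, tags in companies.items():
        if company.lower() == "pythoncore":
            continue  # official releases are skipped
        for tag in tags:
            # Assumption: the match ignores case.
            if "anaconda" in tag.lower():
                return company, tag
    return None


def read_python_companies(hive):
    """Windows only: list <Company>/<Tag> names under Software\\Python."""
    import winreg  # imported here so the module still loads elsewhere

    companies = {}
    try:
        with winreg.OpenKey(hive, r"Software\Python") as root:
            company_count = winreg.QueryInfoKey(root)[0]
            for i in range(company_count):
                company = winreg.EnumKey(root, i)
                with winreg.OpenKey(root, company) as company_key:
                    tag_count = winreg.QueryInfoKey(company_key)[0]
                    companies[company] = [
                        winreg.EnumKey(company_key, j) for j in range(tag_count)
                    ]
    except OSError:
        pass  # the key is missing or unreadable in this hive
    return companies
```

On Windows, `find_anaconda_tag(read_python_companies(winreg.HKEY_CURRENT_USER))` would run the check against one hive. Separating the matching rule from the registry access keeps the rule easy to test without a registry.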

One consideration - it's possible for Anaconda to be installed and for Python and Jupyter kernels to be working correctly on the user's system without any changes. We need to figure out how to identify that it's "not working", and then employ this check after the fact. The other option is to have the user explicitly denote that they are in an Anaconda-only environment via settings.

Within the project, we have a new RegistryService class that helps discover if we have Anaconda installed. The RegistryService class is used by KernelManager. It starts by taking whatever the executable is from the kernel.json file. If that path ends with python.exe, we assume we need to check if running Python requires any special setup.
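As a rough illustration of that first step (not the KernelManager code itself), the following Python sketch pulls the executable out of a kernel.json document and applies the python.exe test. The function names are hypothetical, and ignoring case in the comparison is an assumption.

```python
import json


def kernel_executable(kernel_json_text):
    """Return the first argv entry of a kernel spec, i.e. the program to launch."""
    spec = json.loads(kernel_json_text)
    argv = spec.get("argv") or []
    return argv[0] if argv else None


def needs_python_setup_check(executable):
    """True when the executable path ends with python.exe."""
    # Assumption: the comparison ignores case, since Windows paths do.
    return executable is not None and executable.lower().endswith("python.exe")
```

Kernels launched through any other executable would skip the Anaconda detection entirely.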

This is where RegistryService comes into play. It looks at the SOFTWARE\Python registry key path, first under HKEY_LOCAL_MACHINE and then under HKEY_CURRENT_USER. It descends into all of the sub-keys under that path until it finds one that contains 'Anaconda' anywhere in the name.

KernelManager gets back this key. If we find a key that has 'Anaconda' in the name, we then get the default value of the InstallPath sub-key. This gives us the directory where Anaconda executables can be found. Just to make sure, we check that the directory exists. If it does, we move ahead assuming we have an Anaconda environment present.
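A minimal Python sketch of this lookup, assuming the company and tag names have already been found. The function names are hypothetical, and the registry read only works on Windows.

```python
import os


def read_install_path(hive, company, tag):
    """Windows only: return the default value of the InstallPath sub-key."""
    import winreg  # imported here so the module still loads elsewhere

    subkey = "\\".join(["Software", "Python", company, tag, "InstallPath"])
    try:
        with winreg.OpenKey(hive, subkey) as key:
            # An empty value name selects the key's default value.
            value, _value_type = winreg.QueryValueEx(key, "")
    except OSError:
        return None  # key or default value is missing
    return value


def usable_install_dir(install_path):
    """Return install_path if it names an existing directory, else None."""
    if not install_path:
        return None
    return install_path if os.path.isdir(install_path) else None
```

The directory check guards against stale registry entries left behind by an uninstalled or moved Anaconda.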

We don't want to do work if we don't have to, so we do a quick test launching python.exe -h from the command line. If this fails to run, we can assume Python doesn't run from the default shell, and it requires the special Anaconda shell setup. Using the base Anaconda directory, we craft a new command and set of arguments for launching the kernel. Instead of running Python directly like this:

C:\Anaconda\Path\python.exe <arguments>

We run a chained set of commands to enable the Anaconda environment and then launch Python. Note that this means sending cmd.exe as the actual process command instead of Python:

cmd.exe /C C:\Anaconda\Path\condabin\activate.bat C:\Anaconda\Path\ & C:\Anaconda\Path\python.exe <arguments>
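The probe and the rewritten command can be sketched in Python as follows. The function names and the timeout are hypothetical, and the text does not say whether the probe uses the bare python.exe name or a full path, so the sketch takes the executable as a parameter. Paths containing spaces would need quoting that this sketch does not attempt.

```python
import ntpath
import subprocess


def runs_from_default_shell(python_exe="python.exe", timeout=15):
    """Return True if `python_exe -h` starts and exits cleanly."""
    try:
        result = subprocess.run(
            [python_exe, "-h"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # not found, not runnable, or hung
    return result.returncode == 0


def activated_command(anaconda_root, python_exe, kernel_args):
    """Return (command, arguments) that activate Anaconda, then run Python."""
    activate = ntpath.join(anaconda_root, "condabin", "activate.bat")
    arguments = "/C {} {} & {} {}".format(
        activate, anaconda_root, python_exe, " ".join(kernel_args)
    )
    # cmd.exe becomes the process to start; Python runs inside it.
    return "cmd.exe", arguments
```

A caller would only switch to `activated_command` when `runs_from_default_shell` returns False, so installations that already work are left alone.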

From here, we just hope things work. This was tested on a few different environments - some with Python only (no Anaconda), some with Anaconda only for the user, some with Anaconda installed for the system.

NOTE: This approach relies on a lot of conventions and assumptions. It seemed like a reasonable place to start, but we may find that it needs to be adjusted over time.