If you need Python modules that are not yet available on the farm machines, there are three ways to get them:

  1. Check if the module has been packaged by Debian and if it is available for the current release.
  2. Install it in your home directory.
  3. Use Python VirtualEnvs to install your modules independently from the ones already available on your Linux installation.

Checking if the module is available as Debian package

Python module packages in Debian are named python-MODULENAME; for example, the Debian package for sphinx is python-sphinx. As a general rule, we can only provide modules and versions that are available in the official Debian package repositories, since we lack the manpower to build and maintain others.

If the module you want is available as a Debian package, you can check the available version by running apt-cache policy PACKAGENAME:
$ apt-cache policy python-sphinx
  Installed: 1.1.3+dfsg-4
  Candidate: 1.1.3+dfsg-4
  Version table:
 *** 1.1.3+dfsg-4 0
        500 wheezy/main amd64 Packages
        100 /var/lib/dpkg/status

The output states that Debian package version 1.1.3+dfsg-4 of python-sphinx is currently installed, which corresponds to upstream version 1.1.3.

$ apt-cache policy python-sphinx-issuetracker
  Installed: (none)
  Candidate: 0.8-1
  Version table:
     0.8-1 0
        500 wheezy/main amd64 Packages

In this case Debian package version 0.8-1 is available but not installed; 0.8-1 corresponds to upstream 0.8.
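
If you want to check several packages at once, the output shown above can also be read by a small script. The following is only a sketch: the function names parse_policy and upstream_version are made up for this example, and stripping everything after a "+" is a heuristic that matches the "+dfsg" case above, not a general rule for Debian version strings. It is written so that it should run on both Python 2.7 and Python 3.

```python
import re

# Excerpt of `apt-cache policy python-sphinx-issuetracker` as shown above.
SAMPLE = """\
  Installed: (none)
  Candidate: 0.8-1
  Version table:
     0.8-1 0
        500 wheezy/main amd64 Packages
"""


def parse_policy(text):
    """Extract the 'Installed' and 'Candidate' fields from apt-cache policy output."""
    versions = {}
    for line in text.splitlines():
        match = re.match(r"\s*(Installed|Candidate):\s*(.+?)\s*$", line)
        if match:
            versions[match.group(1).lower()] = match.group(2)
    return versions


def upstream_version(debian_version):
    """Guess the upstream version from a Debian package version string."""
    version = debian_version.split(":", 1)[-1]   # drop an epoch such as "1:"
    if "-" in version:
        version = version.rsplit("-", 1)[0]      # drop the Debian revision
    return version.split("+", 1)[0]              # heuristic: drop "+dfsg"-like suffixes


info = parse_policy(SAMPLE)
print("installed: %s" % info["installed"])
print("candidate: %s" % info["candidate"])
print("upstream:  %s" % upstream_version(info["candidate"]))
```

To use it on a real machine, you would pass the text printed by apt-cache policy PACKAGENAME to parse_policy instead of SAMPLE.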

If you require a Python library that is available but not installed, create a ticket that includes the name of the package and the name of your local machine.

Installing modules into your home directory

Setting environment variables

If you decide to install Python libraries to your home directory, you will probably need to extend your Python load path. This can be done via the environment variable PYTHONPATH, which is not set by default. As long as it is not set, your Python interpreter will usually only look for modules in certain system directories. You can find out what these are by running the following command:

$ python -c "import sys; print sys.path"
['', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-linux2', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PIL', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.7']

If, for example, you install modules to ~/mypythonmodules and want them to be included in your load path, you can do it like this:
$ PYTHONPATH=~/mypythonmodules python -c "import sys; print sys.path"
['', '/u/yourusername/mypythonmodules', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-linux2', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PIL', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.7']
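
The same effect can be observed from within Python. The sketch below starts a second interpreter with PYTHONPATH set and reads back its sys.path. The helper name child_sys_path is made up for this example, and a temporary directory stands in for ~/mypythonmodules; the code is written so that it should run on both Python 2.7 and Python 3.

```python
import os
import subprocess
import sys
import tempfile


def child_sys_path(extra_dir):
    """Start a fresh interpreter with PYTHONPATH=extra_dir and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
    )
    return out.decode().splitlines()


# Hypothetical stand-in for ~/mypythonmodules.
module_dir = tempfile.mkdtemp()
paths = child_sys_path(module_dir)
print("PYTHONPATH entry visible in child: %s" % (module_dir in paths))
```

Note that the "~" in the shell command above is expanded by the shell before Python starts, which is why the listing shows the full path to the home directory.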

Since setting the variable each time you run Python is tedious, you can either export it in your current shell by running
$ export PYTHONPATH=~/mypythonmodules

or set it for every new shell by adding that line to your ~/.profile or ~/.bashrc.

If you install modules in your home directory and forget to set PYTHONPATH, you may see errors like
ImportError: No module named mymodule
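
When you run into this error, a quick check like the following can tell you whether the interpreter is able to find a module with its current load path. This is a sketch only: can_import and report are names invented for this example, and mymodule is the placeholder module name from the error message above.

```python
import importlib
import sys


def can_import(module_name):
    """Return True if module_name can be imported with the current sys.path."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


def report(module_name):
    """Return a one-line summary of whether module_name was found."""
    if can_import(module_name):
        return "%s: found" % module_name
    return "%s: not found, searched %d sys.path entries" % (module_name, len(sys.path))


print(report("os"))
print(report("mymodule"))
```

If a module you installed to your home directory is reported as not found, compare the directories in sys.path with the directory you installed to.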

Installing via pip

The default way to run pip is to execute pip install MODULENAME. However, this will probably fail, because without further options pip will try to write to system directories, which normal users are not allowed to modify.
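
To see which of the directories on your load path you are actually allowed to write to, you could use a check along these lines. This is a minimal sketch: the function name writable is made up for this example, and the check says nothing about where a particular pip version would try to install.

```python
import os
import sys


def writable(paths):
    """Return the entries of paths that are existing directories the current user can write to."""
    return [p for p in paths if p and os.path.isdir(p) and os.access(p, os.W_OK)]


candidates = [p for p in sys.path if p]
print("directories on sys.path: %d" % len(candidates))
print("writable by you:         %d" % len(writable(candidates)))
```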

Instead, decide where in your home directory you would like to keep your Python modules and pass this location as a prefix option. The following example installs sphinxcontrib-bibtex and its dependencies to ~/python:

$ pip install --install-option="--prefix=~/python" sphinxcontrib-bibtex
Downloading/unpacking sphinxcontrib-bibtex
  Downloading sphinxcontrib-bibtex-0.3.4.tar.gz (50Kb): 50Kb downloaded
  Running egg_info for package sphinxcontrib-bibtex

...skipped output ...

Requirement already satisfied (use --upgrade to upgrade): PyYAML >=3.01 in /usr/lib/python2.7/dist-packages (from pybtex>=0.17->sphinxcontrib-bibtex)

...skipped output ...

    Skipping installation of /u/yourusername/python/lib/python2.7/site-packages/sphinxcontrib/__init__.py (namespace package)

...skipped output ...

Successfully installed sphinxcontrib-bibtex latexcodec pybtex pybtex-docutils six oset
Cleaning up...

I chose this example and these excerpts of the output to highlight a few things:
  • pip starts by downloading the package and then checks its dependencies
  • If pip recognizes that some dependencies are already available (PyYAML in the example), it skips their installation
  • pip will not install __init__.py files for so-called namespace packages. This causes import sphinxcontrib.bibtex to fail: the sphinxcontrib directory, which contains the bibtex directory, has no __init__.py file and is therefore not recognized as a package
  • /u/yourusername/python/lib/python2.7/site-packages is the directory you will need to add to your PYTHONPATH
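
The directory to add to PYTHONPATH can be derived from the prefix and the Python version. The sketch below assumes the PREFIX/lib/pythonX.Y/site-packages layout seen in the output above; other pip versions or platforms may use a different layout. The function name site_packages_for_prefix is made up for this example.

```python
import os
import sys


def site_packages_for_prefix(prefix, version_info=sys.version_info):
    """Return PREFIX/lib/pythonX.Y/site-packages for the given interpreter version."""
    prefix = os.path.expanduser(prefix)
    python_dir = "python%d.%d" % (version_info[0], version_info[1])
    return os.path.join(prefix, "lib", python_dir, "site-packages")


# The prefix and Python version from the example above.
print(site_packages_for_prefix("/u/yourusername/python", (2, 7)))
# -> /u/yourusername/python/lib/python2.7/site-packages
```

Called without the second argument, the function uses the version of the interpreter that runs it.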

After installation you will need to do the following steps:
  • Set your PYTHONPATH
  • Add an empty __init__.py file for all namespace packages
  • (Optional) If your python modules also install scripts to ~/python/bin you might want to extend your PATH

In the example above this is achieved by running the following commands:

$ export PYTHONPATH=~/python/lib/python2.7/site-packages
$ touch $PYTHONPATH/sphinxcontrib/__init__.py
$ # optional: set PATH
$ export PATH=~/python/bin:$PATH
$ # optional: set PYTHONPATH in your .bashrc.
$ echo export PYTHONPATH=~/python/lib/python2.7/site-packages >> ~/.bashrc
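
Python 2.7 only treats a directory as a package if it contains an __init__.py file, so the empty-file step can also be scripted when several namespace packages are affected. The following is a sketch under that assumption; the function name add_missing_init is made up for this example, and you have to tell it which package names to fix.

```python
import os


def add_missing_init(site_packages, namespace_packages):
    """Create an empty __init__.py in each listed package directory that lacks one.

    Returns the list of files that were created.
    """
    created = []
    for name in namespace_packages:
        package_dir = os.path.join(site_packages, *name.split("."))
        init_file = os.path.join(package_dir, "__init__.py")
        if os.path.isdir(package_dir) and not os.path.exists(init_file):
            open(init_file, "a").close()   # same effect as `touch`
            created.append(init_file)
    return created


# Paths from the example above; nothing happens if the directory does not exist.
target = os.path.expanduser("~/python/lib/python2.7/site-packages")
for path in add_missing_init(target, ["sphinxcontrib"]):
    print("created %s" % path)
```

Existing __init__.py files are left untouched, so running the script a second time does not change anything.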

Use Python VirtualEnvs

What is a Python virtual environment?

At its core, the main purpose of Python virtual environments is to create an isolated environment for Python projects. This means that each project can have its own dependencies, regardless of what dependencies every other project has.

On the GSI Linux desktops the virtualenv command is available for Python version 2 and version 3. In particular, the following three packages should be installed on your machine:
  • python-virtualenv
  • python3-virtualenv
  • virtualenv

An example: install Tensorflow

The following is an example of how to install the Tensorflow framework from Google using Python VirtualEnvs.
  1. mkdir /data.local1/tensorflow
  2. virtualenv --system-site-packages /data.local1/tensorflow
  3. source /data.local1/tensorflow/bin/activate (for Bash or Zsh). You should see that the previous command has modified your shell prompt into something like: (tensorflow)$
  4. pip install --upgrade tensorflow
Step 4 will take a while because pip has to download all the dependencies and compile everything.

NOTE: in order to use Tensorflow, you will always have to run the command from step 3 first, in every new shell.

If you wish to leave the Python virtual environment you'll just need to issue the following command: deactivate
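
If you are unsure whether a virtual environment is currently active, you can ask the interpreter itself. The sketch below relies on two interpreter attributes: older virtualenv releases set sys.real_prefix, while newer virtualenv releases and the venv module make sys.base_prefix differ from sys.prefix. The function name in_virtualenv is made up for this example.

```python
import sys


def in_virtualenv():
    """Return True if the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and venv change sys.prefix but keep sys.base_prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("prefix: %s" % sys.prefix)
print("inside a virtual environment: %s" % in_virtualenv())
```

After running source /data.local1/tensorflow/bin/activate, the prefix printed by this script should point to /data.local1/tensorflow.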



-- BastianNeuburger - 19 Jul 2016

-- MatteoDessalvi - 09 Oct 2017
Topic revision: r5 - 2019-06-03, MatteoDessalvi