School of Physics and Astronomy – computing

Python

Python is a very widely used scripting language: see python.org or the Wikipedia article.

Python is popular because it is reasonably easy to read (for some value of ‘easy to read’), because it has a large library (so whatever software problem you want to address, there is a good chance there is a Python library for it), and because it is popular (ie, ‘everyone uses it’).

Python is a free/open-source language.

The notes below are not an introduction to Python, but are intended as a collection of fragments of local advice about it. There is lots of ‘getting started’ advice on the web.

Jupyter §

Within the school, we support a JupyterHub server, which supports Python notebooks. This is a flexible and fairly easy way to get started, but it has a few limitations, which you may start to run into if you use Python extensively.

Installation, and versions §

Depending on the OS on your machine, Python may or may not be already installed. If you need to install it, or if the pre-installed version is too old (see below), there are several ways to do so. For various reasons, we currently semi-recommend Anaconda as the simplest of these.

Anaconda installs a complete Python distribution, separately from any pre-installed version (ie, not replacing it), and installs a tool for managing Python packages, which are the bundles which contain libraries and other additional Python functionality. Using Anaconda will probably work OK for you, but it's a bit of a blunt instrument.

Alternatively, you can use a system Python, and install packages into this (see below for the right and wrong ways of doing this). This gives you more control, but it is possible to mess this up and hobble your installed Python.

If you do not have a pre-installed Python, and do not want to use Anaconda, you can either download it from python.org, or use your system's package manager to install it (eg, yum install python34 on CentOS).

You should use Python 3 for all new code unless you have a very good reason for using Python 2. A system's pre-installed Python may be version 2, for legacy reasons; you can check with python --version. The system Python 3 might be invoked using the command python3 rather than plain python.

It doesn't much matter which version of Python 3 you use, within reason. At the time of writing, Python 3.11 is stable, but Python 3.6 or even 3.4 might be the most recent available in a more conservative Linux distribution, and you'll probably get away with that (though some packages do specify an ‘oldest supported’ Python version). Newer is generally better, but it's best to stay away from bleeding-edge versions unless you have a strong need for them, and even then to proceed with your eyes wide open.
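If a script of yours does depend on a minimum Python version, it can check for it when it starts, rather than failing obscurely later. The following is a minimal sketch using the standard sys.version_info tuple; the function name meets_minimum and the (3, 6) threshold are illustrative assumptions, not a local requirement.

```python
import sys


def meets_minimum(required, actual=None):
    """Return True if the version tuple `actual` is at least `required`.

    `actual` defaults to the (major, minor) pair of the running interpreter.
    The name and signature are illustrative, not a standard API.
    """
    if actual is None:
        actual = sys.version_info[:2]
    return tuple(actual) >= tuple(required)


if __name__ == "__main__":
    # (3, 6) is an arbitrary example threshold.
    if meets_minimum((3, 6)):
        print("Python version OK:", sys.version.split()[0])
    else:
        print("This script wants Python 3.6 or newer")
```

Tuples compare element by element, so (3, 11) counts as newer than (3, 6), which a naive string comparison would get wrong.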

Python packages, and virtual environments §

The following discussion presumes that you are aiming for command-line usage of Python. It is possible to use virtual environments on JupyterHub, but doing so is a little more involved.

There is a huge range of Python packages: some are actively developed, some abandoned; sometimes functionality will be supplied by more than one competing package; sometimes packages are incompatible with each other. It's possible, and indeed relatively easy if you're not very careful, to mess up your collection of Python packages by installing mutually incompatible ones.

The usual Python package installer is pip. Here is some (rather opinionated) advice about using pip:

A Python ‘virtual environment’ (or ‘venv’) is a collection of Python packages which can be added to or removed from command-line visibility. You maintain a virtual environment using the Python venv module.

The advantages of using venvs are:

If you need to install a collection of packages for a particular project, say foo, then do so as follows. Note that the venv module works only with Python 3, so you may need to select that version explicitly, using python3 in this example.

% cd path/to/project
% python3 -m venv foo

This will create a directory foo which will contain the packages for this project. You can then ‘activate’ that environment with:

% source foo/bin/activate
(foo)% which python
.../foo/bin/python
(foo)% python --version
Python 3.x.x

Note that the prompt changes to remind you that you're using the venv.
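The creation step (python3 -m venv foo) can also be scripted, which may be handy in a project set-up script. This is a minimal sketch using the standard-library venv.create function; the helper name create_project_env and its defaults are illustrative assumptions. It only creates the environment: activating it in your shell is still done with source, as above.

```python
import os
import venv


def create_project_env(project_dir, name="foo", with_pip=True):
    """Create a virtual environment called `name` inside `project_dir`.

    Returns the path of the new environment directory.
    The helper name and its defaults are illustrative only.
    """
    env_dir = os.path.join(project_dir, name)
    # Equivalent in spirit to running: python3 -m venv <env_dir>
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


# Example use (the path is a placeholder):
#   create_project_env("path/to/project")
```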

After you source the activate script, the command python refers to a version of Python in your virtual environment rather than any system one. This will be true until you close the current shell (or terminal window). When you open a new shell (or terminal), the default python command will refer to the original, unadulterated, one, until you activate the venv version using source .../foo/bin/activate.
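If you are unsure which Python you are running, the interpreter can tell you. The sketch below compares sys.prefix with sys.base_prefix, which differ when Python is running from an environment created by venv; the function name is an illustrative assumption, and environments made by other tools (Anaconda's, for example) may not be detected this way.

```python
import sys


def in_virtual_environment(prefix=None, base_prefix=None):
    """Return True if `prefix` and `base_prefix` differ.

    Both default to the values for the running interpreter. Inside a venv,
    sys.prefix points at the venv directory and sys.base_prefix at the
    Python installation the venv was created from.
    """
    if prefix is None:
        prefix = sys.prefix
    if base_prefix is None:
        base_prefix = sys.base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a venv:", in_virtual_environment())
```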

If you subsequently use pip to install packages:

(foo)% python -m pip install matplotlib

then this uses this ‘venv’ python, and installs the packages within the venv. Note that there is no --user flag here.
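To confirm that a package really is being picked up from the venv, rather than from a system or per-user location, you can ask Python where it would import it from. This is a sketch only, using importlib.util.find_spec from the standard library; the helper names are illustrative assumptions, and matplotlib is used simply because it is the package installed above.

```python
import importlib.util
import os
import sys


def module_location(name):
    """Return the file a top-level module would be imported from, or None.

    The module itself is not imported. Built-in modules and namespace
    packages do not report a useful file path.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_inside(path, prefix):
    """Return True if `path` lies within the directory tree rooted at `prefix`."""
    path = os.path.abspath(path)
    prefix = os.path.abspath(prefix)
    return os.path.commonpath([path, prefix]) == prefix


if __name__ == "__main__":
    location = module_location("matplotlib")
    if location is None:
        print("matplotlib is not installed for this interpreter")
    else:
        print("matplotlib would be imported from", location)
        print("Under sys.prefix:", is_inside(location, sys.prefix))
```

With the venv active, sys.prefix is the foo directory, so a package installed as above should be reported as lying under it.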

You can have multiple such venv structures, containing different versions of packages, depending on your requirements. If you mess up the collection of packages, you can delete the whole lot by simply deleting this foo directory and starting again.

If your work is at all sensitive to the collection of packages you use, then you should take care to record that collection of packages as part of the documentation of your scientific work. Do that using pip:

(foo)% python -m pip freeze >requirements.txt

If you preserve this requirements.txt file, then you can recreate the identical set of packages in a future venv using:

(foo)% python -m pip install -r requirements.txt

That is, the requirements.txt file should be checked in to your project code repository as part of your source code (you do use version control, don't you?).
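If you later want to check that the current environment still matches a saved requirements.txt, something like the following sketch may help. It uses importlib.metadata, which is in the standard library from Python 3.8; the function names are illustrative assumptions, and the parser handles only the simple name==version lines which pip freeze typically writes, skipping everything else.

```python
import os
from importlib import metadata  # standard library from Python 3.8


def parse_pinned(text):
    """Parse 'name==version' lines into a dict of {name: version}.

    Blank lines, comments, and any line without '==' are skipped; this is
    a deliberate simplification of the full requirements-file syntax.
    """
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for each pin not matched exactly."""
    return {name: (wanted, lookup(name))
            for name, wanted in pins.items()
            if lookup(name) != wanted}


if __name__ == "__main__":
    if os.path.exists("requirements.txt"):
        with open("requirements.txt") as handle:
            problems = find_mismatches(parse_pinned(handle.read()))
        for name, (wanted, found) in sorted(problems.items()):
            print(name, "wanted", wanted, "found", found)
```

An empty result means that every pinned package is installed at exactly the recorded version.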