Try the Docker packaged ULFM fault tolerant MPI

To support the SC’16 Tutorial, we have designed a self contained Docker image. This packaged docker image contains everything you need to compile, and run the tutorial examples, in a contained sandbox. Docker can be seen as a lightweight virtual machine, running its own copy of an operating system, but without the heavy requirement of a full-blown hypervisor. We use this technology to package a very small Linux distribution containing gcc, mpicc, and mpirun, as needed to compile and run natively your fault tolerant MPI examples on your host Linux, Mac or Windows desktop, without the effort of compiling a production version of ULFM Open MPI on your own.

Content:

1. A Docker Image with a precompiled version of ULFM Open MPI 1.1.
2. The tutorial hands-on example.
3. Various tests and benchmarks for resilient operations.
4. The sources for the ULFM Open MPI branch release 1.1.

Using the Docker Image

1. Install Docker
You can install Docker quickly, either by downloading one of the official builds from http://docker.io for MacOS and Windows, or by installing Docker from your Linux or MAcOS package manager (i.e. yum install docker, apt-get docker-io, brew/port install docker-io). Please refer to the Docker installation instructions for your system.
2. In a terminal, verify that the docker installation works by running

3. Unpack the package:

3. Load the pre-compiled ULFM Docker machine into your Docker installation:

4. Source the docker aliases, which will redirect the “make” and “mpirun” command in this terminal’s local shell to execute the provided commands from the Docker machine.

5. Go to the tutorial examples directory. You can now type make to compile the examples using the Docker provided “mpicc”, and you can execute the generated examples in the Docker machine using mpirun -am ft-enable-mpi -np 10 example. Note the special -am ft-enable-mpi parameter; if this parameter is omitted, the non-fault tolerant version of Open MPI is launched and applications containing failures will automatically abort.

Have fun!