ULFM User Level Failure Mitigation

ULFM Docker Package

There are many ways to install ULFM. For performance evaluation, large-scale experiments, and production platforms, you should follow the instructions in the Open MPI ULFM Readme. However, for a quick test, or for a small test that is not performance-critical, you may prefer to spend your time on the concepts rather than on installation. For that purpose, we provide a Docker image for those who want to quickly test its capabilities.

Using the Docker Image

  1. Install Docker
    • Docker can be seen as a “lightweight” virtual machine. Docker is available for a wide range of systems (macOS, Windows, Linux). You can install Docker quickly, either by downloading one of the official builds for macOS or Windows, or by installing Docker from your package manager (e.g. yum install docker, apt-get install docker.io, port install docker-io, etc.).
  2. In a terminal, run docker run hello-world to verify that the Docker installation works.
  3. Load the pre-compiled ULFM Docker machine into your Docker installation: docker pull abouteiller/mpi-ft-ulfm
  4. Source the Docker aliases in a terminal. This redirects the “make”
    and “mpirun” commands in the local shell so that they execute in the Docker machine.
    1. alias make='docker run -v $PWD:/sandbox abouteiller/mpi-ft-ulfm make'
       alias mpirun='docker run -v $PWD:/sandbox abouteiller/mpi-ft-ulfm mpirun --with-ft ulfm --map-by :oversubscribe --mca btl tcp,self'
  5. Run some examples to see how this works. Quick examples can be found in the tutorial examples directory. You can now type make to compile the examples using the Docker-provided “mpicc”, and you can execute the generated examples in the Docker machine using mpirun -np 10 example
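
The aliases in step 4 are meant for an interactive terminal. In bash, aliases are not expanded in non-interactive shells by default, so a script that drives the examples may be better served by shell functions. The following is a minimal sketch under that assumption: the function names ulfm_make and ulfm_mpirun and the ULFM_IMAGE variable are hypothetical and not part of the ULFM package, while the image name and the docker run and mpirun arguments are copied from the aliases in step 4.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; the image and flags are taken from the aliases in step 4.
ULFM_IMAGE="abouteiller/mpi-ft-ulfm"

# Run "make" inside the container, with the current directory mounted at /sandbox.
ulfm_make() {
    docker run -v "$PWD:/sandbox" "$ULFM_IMAGE" make "$@"
}

# Run "mpirun" inside the container with the same options as the mpirun alias.
ulfm_mpirun() {
    docker run -v "$PWD:/sandbox" "$ULFM_IMAGE" \
        mpirun --with-ft ulfm --map-by :oversubscribe --mca btl tcp,self "$@"
}
```

With these definitions loaded, calling ulfm_make in the examples directory and then ulfm_mpirun -np 10 example should behave like the aliased make and mpirun commands. Quoting "$PWD:/sandbox" also keeps the volume mount intact when the working directory contains spaces.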

Have fun!