8.4.3. TMpiRun
In this case, many processes are started on different nodes. MPI uses the distributed-memory
paradigm: each process has its own address space. All processes run the same macro and define their
own objects. If a big object is created in the common part of the macro, every process allocates
its own copy (this is why, generally, the main dataserver object is created in the onMaster part,
to avoid creating as many dataservers as there are slaves).
The main methods behave as follows:
- the constructor calls MPI_Init for the initial process synchronisation. This step is automatic, as long as one is running through the on-the-fly C++ compiler thanks to the root command, or in python.
- startSlave either returns immediately for the master process (id=0) or starts the evaluation loop for the other ones.
- onMaster is true or false, depending on whether we are on the master process or not.
- stopSlave puts fake items in for evaluation and then exits. The evaluation processes receive them, stop their loop, exit from startSlave, and usually skip the master block instructions. Unlike threads, the master process does not wait for the evaluation processes.
- the destructor calls MPI_Finalize for the final process synchronisation. In the specific case of python, where ROOT and python (through the garbage collector approach) both try to destroy objects, a specific line (ROOT.SetOwnership(run, True)) has to be added, as discussed in the Macro section.
The TMpiRun constructor takes a single argument, a pointer to a TEval object.
# Creating the MPI runner
mrun = Relauncher.TMpiRun(code)
To run a macro in an MPI context, you have to use the mpirun command. Here is a simple way to run
our example:
mpirun -n 8 python RosenbrockMacro.py
Here, we launch the python macro on 8 processes (-n 8). The mpirun command has other options that are not discussed here.
In general, an MPI job is run on a cluster through a batch scheduler: the previous command is placed in a shell script along with the batch scheduler parameters. In that case the macro does not use any viewer, but saves its results in a file, which will be analysed afterwards in an interactive post-processing session using all the ROOT facilities.
Warning
The TMpiRun implementation also requires at least 2 cores (one being the master and the other ones
the cores on which the assessors are run). If only one core is provided, the loop will run indefinitely.