13.8.7. Macro “reoptimizeZoneBiFunMpi.C”
13.8.7.1. Objective
The objective of the macro is to give another example of a two-level parallelism program using the MPI paradigm. In the previous example, the MPI function calls were implicit, handled by the Uranie facilities. In this one, MPI functions are called explicitly. It is presented to illustrate the case where the user evaluation function is itself an MPI function.
It takes the previous zoning-problem example and adapts it. The intermediate level does not use a
TLauncher2 to run all the different arrangements, but encodes this loop itself. Each MPI resource
evaluates a subset of the possible arrangements, keeping its best one, and an MPI reduction merges
these local results into the final one.
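The intermediate-level strategy described above can be sketched without MPI: rank `id` out of `size` ranks evaluates arrangements `id`, `id+size`, `id+2*size`, … and keeps its best one, then the per-rank bests are merged, which is what the MPI reduction does in the macro below. A minimal sequential illustration, where `cost` is a hypothetical stand-in for the real `lowfun` evaluation:

```cpp
// Hypothetical stand-in for the cost of one arrangement (the real macro
// calls lowfun on a row of the `doe` design-of-experiments table).
double cost(int arrangement) { return (arrangement % 7) + 0.01 * arrangement; }

// Local step: rank `id` among `size` ranks scans arrangements
// id, id+size, id+2*size, ... and keeps the index of its best one.
int localBest(int id, int size, int doesize) {
    int best = id;
    for (int i = id + size; i < doesize; i += size)
        if (cost(i) < cost(best))
            best = i;
    return best;
}

// Global step: merge the per-rank bests into a single winner, which is
// what the MPI reduction performs in parallel in the macro.
int globalBest(int size, int doesize) {
    int best = localBest(0, size, doesize);
    for (int id = 1; id < size; id++) {
        int b = localBest(id, size, doesize);
        if (cost(b) < cost(best))
            best = b;
    }
    return best;
}
```

Whatever the number of ranks, the merged result is the same as a single sequential scan; only the work distribution changes.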
13.8.7.2. Macro Uranie
The low-level evaluation function is the same as in the previous example and is not shown again.
using namespace URANIE::DataServer;
using namespace URANIE::Relauncher;
using namespace URANIE::Reoptimizer;
using namespace URANIE::MpiRelauncher;

#include "reoptimizeZoneCore.C"
#include "reoptimizeZoneDoe.h"

struct mpiret {
    double val;
    int id;
};

int doefun(double*, double*);

void reoptimizeZoneBiFunMpi()
{
    ROOT::EnableThreadSafety();
    int i;
    // inputs
    TAttribute z1("zone1", 0., 1.);
    TAttribute z2("zone2", 0., 1.);
    TAttribute z3("zone3", 0., 1.);
    TAttribute z4("zone4", 0., 1.);
    TAttribute *zo[] = { &z1, &z2, &z3, &z4, NULL };
    // outputs
    TAttribute diff("diff");
    TAttribute v0("v0");
    TAttribute v1("v1");
    TAttribute v2("v2");
    TAttribute v3("v3");
    TAttribute v4("v4");
    TAttribute v5("v5");
    TAttribute v6("v6");
    TAttribute v7("v7");
    TAttribute v8("v8");
    TAttribute v9("v9");
    TAttribute *out[] = {
        &diff, &v0, &v1, &v2, &v3, &v4, &v5, &v6, &v7, &v8, &v9, NULL
    };
    // function
    TCJitEval fun(doefun);
    for (i=0; zo[i]; i++) fun.addInput(zo[i]);
    for (i=0; out[i]; i++) fun.addOutput(out[i]);
    fun.setMpi();
    // runner
    //TThreadedRun runner(&fun, 8);
    //TSequentialRun runner(&fun);
    TBiMpiRun runner(&fun, 3);
    runner.startSlave();
    if (runner.onMaster()) {
        TDataServer tds("tdsvzr", "tds4optim");
        fun.addAllInputs(&tds);
        // solver
        TVizirGenetic gene;
        gene.setSize(300, 200000, 100);
        //TVizirIsland viz(&tds, &runner, &gene);
        TVizir2 viz(&tds, &runner, &gene);
        //viz.setTolerance(0.00001);
        viz.addObjective(&diff);
        viz.solverLoop();
        runner.stopSlave();
        tds.exportData("__coeurM__.dat");
    }
}

int doefun(double *in, double *out)
{
    int i, id, size;
    MPI_Comm comm;
    double z[7], one[11], two[11];
    double *cur, *mem, *swp;
    struct mpiret ret, res;

    comm = URANIE::MpiRelauncher::TBiMpiRun::getCalculMpiComm();
    MPI_Comm_rank(comm, &id);
    MPI_Comm_size(comm, &size);
    // constant inputs, provided by the optimizer
    z[0] = in[0];
    z[1] = in[1];
    z[2] = in[2];
    z[3] = in[3];
    // local search: each rank scans its own stride of the design of experiments
    mem = one;
    cur = two;
    i = id;
    z[4] = doe[i][0];
    z[5] = doe[i][1];
    z[6] = doe[i][2];
    lowfun(z, mem);
    for (i = id+size; i < DOESIZE; i += size) {
        z[4] = doe[i][0];
        z[5] = doe[i][1];
        z[6] = doe[i][2];
        lowfun(z, cur);
        if (cur[0] < mem[0]) {
            swp = mem;
            mem = cur;
            cur = swp;
        }
    }
    // global search
    /* locate the rank holding the minimum */
    ret.val = mem[0];
    ret.id = id;
    MPI_Allreduce(&ret, &res, 1, MPI_DOUBLE_INT, MPI_MINLOC, comm);
    /* fetch the extra data of the minimum */
    if (res.id != 0) {
        if (id == res.id) {
            MPI_Send(mem, 11, MPI_DOUBLE, 0, 0, comm);
        }
        else if (id == 0) {
            MPI_Recv(out, 11, MPI_DOUBLE, res.id, 0, comm, MPI_STATUS_IGNORE);
        }
    }
    else {
        for (i = 0; i < 11; i++) out[i] = mem[i];
    }
    return 1;
}
The top-level function (reoptimizeZoneBiFunMpi) does not change from the previous example and
defines a TBiMpiRun instance.
The MPI evaluation function (doefun) is totally different. It uses the class method
URANIE::MpiRelauncher::TBiMpiRun::getCalculMpiComm to get the MPI communicator object (MPI_Comm)
dedicated to the calculation resources. With it, the usual MPI functions can be called:
MPI_Comm_rank and MPI_Comm_size to get the context; MPI_Allreduce, MPI_Send and MPI_Recv
to communicate between the calculation resources.
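The MPI_MINLOC reduction used in doefun operates on (value, rank) pairs: each rank contributes its best diff together with its rank number, and the reduction returns the global minimum and the rank that owns it. A plain sequential sketch of this semantics (no actual MPI call; minloc is a hypothetical helper shown only to explain the behaviour):

```cpp
// Layout matching the MPI_DOUBLE_INT predefined datatype: a double
// followed by an int, exactly as the macro's `struct mpiret`.
struct mpiret {
    double val;
    int id;
};

// Sequential sketch of what MPI_Allreduce(..., MPI_DOUBLE_INT, MPI_MINLOC, ...)
// computes: keep the smallest `val` over all contributions, together with
// the rank `id` that owns it (ties go to the lowest rank, as in MPI).
mpiret minloc(const mpiret *contrib, int n) {
    mpiret res = contrib[0];
    for (int i = 1; i < n; i++)
        if (contrib[i].val < res.val)
            res = contrib[i];
    return res;
}
```

After the reduction every rank knows `res.id`, so the owner can send its ten extra output values to rank 0 with a plain MPI_Send/MPI_Recv pair, as done at the end of doefun.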
Note that the evaluation function is predeclared and defined after the top-level function. This is a
trick so that cling (the ROOT JIT compiler) knows the MPI functions: when it compiles the macro, it
sees TBiMpiRun first and loads the MPI libraries. With a real MPI user code, which needs its own
library, cling cannot be used. The user must do extra work to make it run (build a standalone program
or a ROOT-compatible library), but the principle presented above still applies.
You may run this script the same way as the previous example, with the same constraint on the
number of resources. Also note that this version is significantly faster than the previous one, since
it avoids the creation and manipulation of a TDataServer at the intermediate level.