Collective comms API
Collective communications as in OpenFOAM code: The most important API calls to do group-wide comms.
On this page
Lecture video
Module transcript
Now that we introduced collective communication, we’ll want to learn how to use them when needed, so in this module, we will take a look at the code interface to these methods.
I want to mention that I will try and follow OpenFOAM’s terminology when naming things, which is a little different than MPI wording, so if you’re familiar with how things are called in MPI context, it might seem a bit strange at first.
The first one is the gather operation, which is more like an MPI reduce. As you can see, on the first line of this example, each process creates a local boolean named v. Then on the second line, the master process might change the value of this variable.
If you want to check whether master or any other process has changed the value, we can let all processes call Pstream::gather with 2 arguments:
-
The first argument is the variable we want to check.
-
The second argument is an object representing the operation which needs to be applied to the gathered variable, so if we pass an object of type orOp, an or operation will be applied to the local values of v from all processes on the master processes.
There is another variant for a gather operation which is more like a true MPI gather, and it operates specifically on lists.
On the first line of this snippet, we create a local list of booleans. Note that the size of the list is the same as the number or processes involved, because then on the next line, each process acts on the element of the list corresponding to its rank only.
The Pstream::gatherList call, again executed on all processes with the same symbol localLst will collect the corresponding elements from all processes and fully populate the local list of the master process.
Similar to gather, there is a Pstream::scatter operation which takes a single argument and overwrites its value on all processes with the value on the master process.
This acts like a “forced push”, more like a broadcast in MPI terminology.
There is also a scatter variant which behaves more like an MPI scatter, and operates on list containers.
Pstream scatterList simply pushes elements of a list from the master process to all other processes, but it does not overwrite the local element that corresponds to the process’s rank.
You see, this is very important, because, instead of a scatterList, you can do a regular scatter if you want a forced overwrite of all list elements.
In terms of code statistics, gatherList variant seems to be used more often because working with lists while gathering is a recommended practice. But then scatter is used more often. Note that after a gatherList, both scatter and scatterList will have the same effect.
If there is a need to synchronize a variable’s value across processes, OpenFOAM provides special reduce and returnReduce functions which take two arguments: the first one is the value to be reduced, and the second is the operation to apply to this value.
The only difference is that reduce’s first argument needs to be a declared local variable; but returnReduce accepts a temporary as a first argument, for example something returned out of a function call.
In this example, if we pass a variable of type sumOp as the second argument of either functions, reduce will sum up the values collected from all processes and updates the local value on all processes with the resulting sum.
In MPI terminology, this is equivalent to Allreduce.
Going back to our formula 1 pit stop. We can consider the driver checking for the green light and the signal from their teammates to be a reduce operation.
So, what happens when the green light ceases to function or if one of the teammates responsible for takeoff signaling walks away?
Well, at least one of the parties involved will panic.
The same happens to MPI processes if one of the processes decides suddenly to not participate in the collective communication. Usually, this results in a stagnation effect, much like deadlocking.
In this code example, we write a function to refine the local mesh by splitting cells into smaller ones.
While performing this task, we want to keep checking for a global number of cells as we don’t want to exceed a specific limit.
So the trivial way to solve this is by declaring a local variable for total number of cells and reduce it with a sum operation at the end of the loop.
This lets all processes know how many cells we have in total.
The problem arises on line 10 here, what if a process returns from this function, or breaks out of the loop for whatever reason.
In that case, that specific process will not be able to execute the reduce call, and the whole communication will come to a halt.
It may seem trivial to avoid this mistake, but it’s not always this simple, you may have hundreds of nested function calls before reducing, and it can be confusing to workout if calling reduce is ensured in all cases.
Remember this example though as we will consider a more advanced version of it again in a later module.
In this module, we haven’t discussed all API methods for parallel communication, but I believe we got the most important basics down. Next step is to learn how to communicate our own data types.