How to make pizza with hybrid programming!
Today I’ll explain you how hybrid programming can cook the biggest pizza in the world. Our job for today is to cover the surface of Europe with pizza! The task is: “ Make a pizza based on the geospatial coordinates where the pizza is being prepared, so that you will use only local ingredients and toppings typical of the region, yummy!” So if the pizza is being cooked in Spain it will have something like “Jamon serrano”, or in Austria it could be “Wiener Schnitzel”.
To achieve this we will need several teams and being organized will be your prerogative.
You will need to split the area of continental Europe in regions, so each team will take a piece resulting from the problem’s domain decomposition.
My idea is to make teams like I’m used to do with MPI, plus OpenMP, and this is actually a way of doing hybrid programming, but wait I’ll tell you more! Every team will have a head chef, or MPI process, and we will call them through a number, or rank. I will fix in the beginning how many chefs I want and where I want to pin them on the map the regions. Every chef gets to choose his helpers but I’ll fix the number (hey I’m still paying them and it’s expensive!) and he has a limit himself. We call them “threads” and they will split between them the workload of the region let’s say in counties.
Each MPI process… Hmm sorry… Chef… looks around and he discovers that in directions North, South, East and West he can communicate, e.g. process 4 can communicate with chef 3,5,7,1.
So they understand that there is some kind of ordering and they believe they are in a topology. They can send and receive any mail or mail boxes with ingredients to and from every other chef, but the post service is faster between neighbors, hence to keep this advantage we will keep communication just between them.
In order to solve this kind of problems I could use a stencil code. It’s an algorithm where the chef and his helpers perform a sequence of sweeps through a grid, the region given to each chef. They divide it in 1 m2 squares and these can be referred to as cells. In each sweep, the team updates all elements of the grid depending on a function or rule on the neighboring elements in a fixed pattern (called stencil). In stencil codes you can modify your cells in place, or have a copy of the grid that you will substitute at the end of sweep. Boundary cells sometimes have to be adjusted during the accomplishment of the task as well. The teams will stop when a certain error coefficient will be small enough, this coefficient measures how far they are from my idea of pizza.
The teams start by setting the grid to an initial value, margarita pizza, or zeros in computer memory terms.
The rule for putting the topping will be a function of the GPS coordinates (x,y) where the worker is located in that moment, the neighboring cells, and there will be some rules for our purpose regarding toppings that cannot be close to each other, so that food will be delicious. For example you can’t put onions close to more onions, or ananas close to ham (I’m joking do whatever you want for this last one!).
The stencil for our workers will be like this:
In order to put a topping in a cell (x,y) we will need to know the other cells highlighted in the image and act accordingly. As the teams move along the grid they will get to the boundaries between regions.
Now teams will have to deal with interconnected problems, since the topping rule will affect their work on the boundaries. Neighboring teams will have to communicate and discover what the closest coworkers are doing behind the border walls.
Since the dough’s from each region are not really touching, each team has to add a stripe of dough that will be duplicating the part of pizza of the other team that they can’t connect to, so that it will be possible to keep track of what ingredients the other regional team has been using. We will call these stripes “Halo data”. This is a waste of dough (or memory data) because it’s just a duplication of what has already been done. N.B. A Dirichlet condition means that the cells that follow that have the same constant value, like a stuffed crust contains the same constant ingredient along the perimeter of the pizza.
To update the halo data, each MPI process will send and receive from another MPI process. The starting position and arrival of data/ingredients is exactly as in the image in a 2D problem.
Now they will know what to put or not put close to those ingredients based on the stencil. This communication is time consuming. Every time there is halo communication everything stops until the Chef knows what the others workers placed (post service has its own latency) and then placing the ingredients on the halo takes also some time.
BONUS Overlap: Can we do faster?
We could overlap communication and computation. “Openmp threads”, the helpers, can not only divide the work to make everything faster, but they can do “tasking”. It means that one of them takes some time in the beginning of an iteration to define a task for him and his colleagues, an action to accomplish with some resources in some area of the region whenever they can, in our case the boundary communication and “computation”. In the next iteration some of the them will perform the stencil on the inner cells and some will do the boundaries tasks and then go back to work on the stencil as soon as possible. Eventually when they will reach the boundaries of the internal region, the communication will be already done.
For today this is already a lot of stuff… See you next time!!!
For news on the project and my personal achievements you will find everything here!