Bratislava: Chapter 2 out of 3
How to MPI
So… To deal with the segmentation fault, Katerina changed the allocations, not only in the makefiles but in the input files too. And that defeated the segmentation fault.
But now she has to start parallelizing the code. So the first thing she needs to do is work out what should be parallelized. When she goes through the code and sees how many routines have to be parallelized to optimize it, she gets pretty anxious. ‘This is too much. There are thousands of lines of code! I can’t make it,’ she thinks. But then Jozef, the co-creator of this monstrous code, helps her take it easy. ‘Start with these routines, and we’ll see how it works out by parallelizing one thing at a time,’ he says. And that is what she does. Katerina figures out which loops in those routines take the longest to compute.
‘I had better start with the outer loop. There will be fewer complications, and I will probably be able to cope with the inner loops once I succeed with the first one. So this is what I should do:
First of all, I need to include mpif.h in my subroutine. Hm.. As far as I can tell, some files are opened and closed in this part of the code. That should be done only on the master, to improve performance and make sure that no information gets lost. What else.. I should also not forget to broadcast the common blocks that are used in other parts of the code, otherwise the other processes may end up with wrong parameter values.
After that, I should write a slave subroutine, which will call the routine I am modifying. This can be tricky.. So I had better check where else in the code my subroutine is called. I have to do this to be sure that I am passing the right variables into the slave routine. That sounds like a good plan so far.’
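Katerina's plan (file handling only on the master, a broadcast of the shared parameters, and a slave routine that calls the real subroutine) can be sketched roughly as follows. Her code is Fortran with mpif.h; this is only a plain-Python illustration that simulates the ranks inside one process, with no MPI library involved. All names here (read_input_on_master, bcast, worker, compute, the "scale" and "shift" parameters) are made up for the example and do not come from her code.

```python
import io

MASTER = 0  # rank of the master process (assumption: rank 0)

def read_input_on_master(rank, stream):
    """Only the master touches the file; every other rank gets None."""
    if rank != MASTER:
        return None
    params = {}
    for line in stream:
        key, value = line.split("=")
        params[key.strip()] = float(value)
    return params

def bcast(per_rank_values, root=MASTER):
    """Simulated broadcast: every rank ends up with a copy of the root's value."""
    return [dict(per_rank_values[root]) for _ in per_rank_values]

def compute(params, x):
    """Hypothetical stand-in for the subroutine being parallelized."""
    return params["scale"] * x + params["shift"]

def worker(rank, params):
    """Slave routine: calls the real routine with the broadcast parameters."""
    return compute(params, float(rank))

nprocs = 4
input_file = io.StringIO("scale = 2.0\nshift = 1.0\n")  # stands in for an input file

# Before the broadcast only the master knows the parameters.
before = [read_input_on_master(r, input_file) for r in range(nprocs)]
# After the broadcast every rank holds the same values.
after = bcast(before)
results = [worker(r, after[r]) for r in range(nprocs)]
```

Without the broadcast step, ranks 1 to 3 would reach compute with no parameters at all, which is the "wrong values" problem she is worried about.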
Of course, along the way she saw that she had to allocate some of the variables of the subroutine she was parallelizing inside the slave routine, because they turned out to be work arrays. But that wasn’t a problem. The monstrous code accepted these changes almost gently, since only some compilation errors occurred. Even when she used MPI_Bcast, through allocatable send buffers, both in the subroutine she was modifying and in the slave routine she wrote, there were only some minor compilation errors.
She was so pleased that she kept going, thinking that everything was going to be just as easy.. She modified the loop so that each iteration could be performed on a different processor. At the end of each computation, the MPI_Gather call would send the results to the master.. But of course that didn’t turn out very well..
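The idea of splitting the outer loop across processors and collecting the results on the master can be sketched like this. Again, this is plain Python that only simulates the ranks and the gather step; it is not her Fortran code, and block_range, outer_loop_body, run_rank and gather_on_master are hypothetical names chosen for the illustration.

```python
def block_range(n_iter, nprocs, rank):
    """Contiguous block of loop indices [start, stop) owned by one rank.

    The first (n_iter % nprocs) ranks get one extra iteration.
    """
    base, extra = divmod(n_iter, nprocs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

def outer_loop_body(i):
    """Hypothetical stand-in for the work done in one outer-loop iteration."""
    return i * i

def run_rank(n_iter, nprocs, rank):
    """What a single rank computes: only its own slice of the loop."""
    start, stop = block_range(n_iter, nprocs, rank)
    return [outer_loop_body(i) for i in range(start, stop)]

def gather_on_master(chunks):
    """Simulated gather: the master concatenates the chunks in rank order."""
    return [value for chunk in chunks for value in chunk]

n_iter, nprocs = 10, 4
chunks = [run_rank(n_iter, nprocs, r) for r in range(nprocs)]
gathered = gather_on_master(chunks)

# The parallel result must match the serial loop.
serial = [outer_loop_body(i) for i in range(n_iter)]
```

One thing the sketch makes visible: with 10 iterations on 4 ranks the blocks have different sizes. MPI_Gather expects every rank to send the same number of elements, so uneven blocks like these would need MPI_Gatherv or padding instead.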
The program hung during execution, even though the serial version still ran (she could check that through the #ifdef’s she had used)..
To Be Continued..