Goodbye FMM!

Goodbye FMM!

My Summer of HPC has been centered around the FMM algorithm which has been roughly explained in a previous post. Studying its fundamentals and its sequential implementation was very interesting but it’s time to say goodbye. I will use this post to give a brief overview of my experience.

Getting started

The first task was to get familiar with FMM terminology and the basic principles that inspired it, some of which we’ve already covered. After that, we studied the layout of the octree that enables the reduced complexity that FMM offers. This was specially important because our goal was to come up with a MPI version in order to leverage a large amount of nodes.

Some difficulties such as dealing with their complex representation parametrized by many template parameters, or deciding how to distribute those octree nodes in a sensible manner motivated us to focus on the near field pass of the algorithm.

Speeding up the near field pass

The near field pass is pretty similar to a naive Coulomb solver, they only differ in that the near field pass focuses on a box and its close neighbours. Our first approach was to come up with the simplest implementation possible and measure how big the communication times grew as we scaled up in the number of nodes.

The results showed that the simplest schema performed surprisingly well on the data distribution part but extremely badly in the results collection. This indicates that broadcasting is very well optimized and that if we want to scale up in number of nodes we have to fix the reduction times.

At this point the hypothesis were three:

  • The message size is too big
  • Perhaps using a Reduce operation somehow chained the delays eventually adding up to a significant amount when the number of nodes grew.
  • There is a load imbalance in spite of the workload being highly homogeneous

Sadly every explanation I came up with proved to be wrong: there wasn’t load imbalance and greatly reducing message size and communicating it directly to the master node didn’t provide any significant speedup.

Conclusions

Our results show that our goal is not as easy as one might think at first. This is not good news but it also means that there is much work left to be done, and that should make you happy! Weird and complex problems leave you room to try crazy ideas and that is definitely fun.

On a personal note, Summer of HPC showed me the difference of tackling a very complex problem without a clear path versus solving a guided university assignment. This change can sometimes be frustrating but it is what it is. Doing research is difficult and you can’t expect to solve everything at first sight.

The experience has been great and much fun, and I would like to thank Ivo and the SoHPC team for making this possible.

Tagged with: , , , ,

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.