The Fast Multipole Method: A Coulomb solver for the masses
There is time for laughs and there is also time for being serious and getting things done. In this blog post I would like to summarize what I have learned about the Fast Multipole Method during this programme. This theoretical introduction is not meant to be exhaustive, nor 100% mathematically proven. The focus was placed on providing a simple explanation of the general functioning of the algorithm so that its high-level workflow can be understood.
“Verily, this vichyssoise of verbiage veers most verbose, so let me simply add that it’s my very good honor to” try to explain this, although you may hate me in the end. Take a seat, coffee, enough food, fasten your seat belts and please do not smoke. If you are not in the mood for reading but you enjoy listening to this kind of stuff, I recommend that you watch my video in which I explain the concepts in a friendly manner:
The simulation of dynamical systems of particles, usually under the influence of physical properties, such as gravitational or electrostatic forces, is a crucial issue in scientific research. This problem is commonly referred as the N-body problem. The main obstacle of that problem is the absence of an analytical solution when the number of bodies, i.e., N , is greater than three total bodies. However, those simulations can be solved using an iterative numerical approach. Therefore, those methods compute the total force exerted on each particle and their potentials at discrete time steps, then compute the resulting velocities and update the positions of the particles accordingly.
A typical example is the simulation of a system of particles with electric charges . The force of a particle with charge acting on a particle with charge is defined by the following expression:
where is the distance vector between particles and .
Given that, the total force acting on each particle can be expressed as the following summation:
As we can observe, calculating the forces acting on a single particle has a computational complexity of since we have to compute all pairwise interactions of the current particle with the rest of the system. Therefore, a naive algorithm for computing all forces exhibits complexity.
The remainder of the simulation steps have a complexity of since computing the velocities by using the forces just needs to iterate once over each particle and the same applies for the position update step. In this regard, the quadratic complexity may be negligible for a small number of particles, but often involve huge particle ensembles so the simulation will be considerably slowed down to a point in which it is non-viable to apply this kind of summation method. Fortunately, due to the increasing importance of N-body simulations for research purposes, fast summation methods have been developed throughout the latter years.
Arguably, the most remarkable one is the , introduced by Greengard and Rokhlin .
This method is considered  together with other notable ones as the , the or the . The mathematician Barry Cipra perfectly summarized the functioning of the algorithm : “This algorithm overcomes one of the biggest headaches of -body simulations: the fact that accurate calculations of the motions of particles interacting via gravitational or electrostatic forces (think stars in a galaxy, or atoms in a protein) would seem to require computations. The fast multipole algorithm gets by with computations. It does so by using multipole expansions (net charge or mass, dipole moment, quadrupole, and so on) to approximate the effects of a distant group of particles on a local group. A hierarchical decomposition of space is used to define ever-larger groups as distances increase.”
In this post, I would like to provide a brief description of the FMM. We will review its core aspects and its mathematical foundations. We will also describe briefly the role of the mathematical operators used for shifting and converting the multipole expansions. At last, we will explain how the application of rotation-based operators to those expansions is able to reduce the complexity of the aforementioned operators from to .
As we previously mentioned, the FMM is a fast summation method which is able to provide an approximate solution to the calculation of forces, potentials or energies within a given precision goal, namely . The method exhibits linear computational complexity, mainly thanks to its sophisticated algorithmic structure. The core aspects of the FMM are the following ones: spatial grouping of particles, hierarchical space subdivision, multipole expansion of the charges and a special interaction scheme.
Spatial Particle Grouping
The main idea behind the FMM can be informally described as the following intuitive concept: the effect that particles, which are close to the observation point (also named target), have over that target particle is dominant compared to the effect produced by remote particles. It is important to remark that the contribution of those remote particles is not zero. Otherwise, we will be resorting to what is known as cut-off scheme. Those methods present a complexity, but lack the error control capabilities of the FMM. That is the main reason why cut-off based methods are not an option for accurate simulations.
Consider the particle distribution shown in Figure 1, in which we want to compute the remote interactions between two different clusters: target, with particles, and source, with particles. The effect of intra-cluster interactions are not taken into account in this example for the sake of simplicity. A traditional summation method requires computing interactions. The FMM is based on the idea that a remote particle from a spatial cluster will have almost the same influence on the target particle as another one from the same cluster, given that the inter-cluster distance is large enough. This means that all particles in the remote cluster can be grouped together into a pseudo-particle. By doing this, we are effectively reducing the amount of interactions to , since each target particle is tested against the remote pseudo-particle once. This grouping scheme can also be used in reverse, by grouping the target cluster thus requiring interactions. In addition, we can also group both source and target clusters, reducing the computation to a single interaction.
Hierarchical Space Subdivision
In order to implement the spatial grouping idea, it is needed to subdivide the simulation space to generate particle groups. The FMM uses a space decomposition scheme based on a recursive decomposition in cubic boxes. Firstly, the simulation spatial domain is enclosed in a three-dimensional box. Then, the cube is subdivided by planes parallel to its faces which pass through the box centroid. This decomposition technique generates eight different child cubes which are recursively subdivided. This hierarchy of cubes is arranged in a tree, namely an octree. The space is subdivided until a predefined depth on that tree. Figure 2 shows an example of this recursive subdivision visualized on a 2D plane.
Special Interaction Scheme
As we previously mentioned, the FMM takes advantage of grouped remote particles to reduce the amount of interactions to be computed. This interaction scheme is the main characteristic of the FMM algorithm. Figure 3 shows the interaction matrix of a distribution of particles using the direct summation method and the resulting interaction matrix by using the FMM.
As we can observe, the direct summation scheme computes a total of interactions. This is due to the matrix symmetry and the dropped out self-interactions. On the other hand, in the FMM scheme the source and target particles are grouped as the distance increases thus dropping the number of interactions significantly.
The immediate neighbors of a certain cell or box are also called its near field and their contributions to the target particle are calculated with the classical direct summation method. The FMM handles a parameter called well-separateness and referred as that represents the maximum distance (in boxes) to the target box for a cell to be considered part of its near field, i.e., it defines the size of the direct neighborhood. In this regard, for an arbitrary value of , there are boxes in the near field of any other box. Figure 4 shows different examples of near fields for a box with non-periodic boundaries.
Multipole Expansion of Charges
In order to accurately depict the charge distribution of the created particle groups, the FMM leverages to multipole and local expansions. A group could be represented by reducing all the charges to a common point, but this is not precise enough to represent the charge distribution inside that cluster. Instead of doing that, the group is represented by a series of increasingly higher-order multipoles. The series is truncated to ta certain order , this parameter controls the final accuracy and balances the computational load.
The solution for the potential of a particle at the spherical coordinates consists of two separate solutions: a multipole expansion with moments for a local part inside the sphere, and a Taylor-like expansion with moments for the outside.
The multipole expansion is defined by the following theorem: suppose that particles of charges with are located at the points with inside the sphere. Then for any with , the potential is given by:
The Taylor-like expansion is defined by the following theorem: suppose that particles of charges are located at the points with outside the sphere. Then for any with , the potential is given by:
Once the core aspects of the FMM have been reviewed, we can put them all together to establish the general workflow of the algorithm. Given a certain separateness criterion , the multipole order and the depth of the tree , the FMM consists of the following steps:
- Pass 1: Expand particles into multipole moments on the lowest level and translate multipole moments up the tree.
- Pass 2: Transform multipole moments into local moments .
- Pass 3: Translate local moments down the tree.
- Pass 4: Compute far field contributions: potentials , forces , and energy .
- Pass 5: Compute near field contributions: potentials , forces , and energy .
The actual phases of the algorithm are usually named . The first step is performed by the Particle-to-Multipole (P2M) operator and it is often considered a preprocessing step. The remainder of the first pass corresponds to the Multipole-to-Multipole (M2M) operator, while the second one is done via the Multipole-to-Local (M2L) operator and the third one with the Local-to-Local (L2L) operator.
As we previously mentioned, the FMM needs three fundamental mathematical operators during its workflow, namely M2M, M2L, and L2L. Those operators are responsible for shifting the multipole expansions up and down the tree levels, and also to convert remote multipole expansions to local ones at each level. In this section we will briefly review those three operators to provide context for the rotation-based operators.
The M2M is a vertical operator which shifts the multipole coefficients up to higher levels of the tree structure. Each box of the 3D tree has eight child boxes in the next lower level. The M2M operator sums up all the moments of the multipole expansions of the child boxes at the center of the parent box. This operator is applied to each level up to the root of the tree. By doing this, each box on every level has a multipole expansion. Since this operator is applied for the first pass, it is also known as .
From a mathematical perspective, each child multipole expansion at the center of that child box is shifted up to the center of its parent box (see Figure 5. The moments of each child multipole expansion are shifted by the operator to produce the moments of the parent’s expansion:
In addition, all the shifted moments of the eight child boxes are finally added up to conform the multipole expansion at the center of the parent box:
The M2L is an horizontal operator which transforms multipole expansions on a level of the tree into local ones on the same level. The multipole coefficients of an external expansion around $a$, are converted into local or Taylor-like coefficients around . The multipole moments of the remote expansion are transformed into local ones by means of the M2L operator, also named :
Figure 6 depicts the M2L operator. It is important to remark that this operator presents a truncation error due to the need for limiting the first summation range. This is usually constrained so that . Additional errors are introduced but analyzing this aspect of the mathematical operators is outside the scope of this work. The reader may refer to  for error analysis of the FMM operators.
The L2L is a vertical operator which shifts local expansions from a parent box to the centers of its children boxes thus acting over different levels of the tree. The Taylor-like moments of a parent box with center at are shifted down also as local moments to the center of a child box . The mathematical definition of this shifting which makes use of the L2L operator, also known as , is defined as follows:
Figure 7 shows the functioning of this operator over a 2D tree, and its analytical domain. This operator also introduces error due to the finite representation of the local expansion (notice that the first summation is limited to poles). It is also important to remark that no additional errors are introduced, so the M2L or operator becomes the most error-prone one.
All the described operators have a computational complexity of . This fact is considerably harmful when dealing with high-precision simulations in which the order of the multipoles is usually large. Taking into account that the three passes take a considerable amount of the total execution time of a FMM step, it is critical to reduce the cost of the mathemathical operators in order to achieve faster simulations.
A set of more efficient operators with computational complexity scaling were proposed by White and Head-Gordon. The reduced complexity is achieved by rotating the multipole expansions so that the translations or shifts are performed along the quantization axis of the boxes (see Figure 8). This reduces the 3D problem to a 1D one.
Along the quantization axis, and so the operators take an especially simplified form . In order to rotate the multipole or local expansions, Wigner rotation matrices are applied. Two successive rotations are performed: first one about the -axis with a rotation angle , followed by another one about the new -axis with a rotation angle (see Figure 9). Those angles correspond to the polar and azimuthal angles of the shift vector , depicted as in Figure 8.a and 8.c.
The multipole moments of an expansion with respect to a coordinate system which has been rotated twice, first and angle about the -axis and then about the -axis, can be expressed as a linear combination of the moments with respect to the original coordinate system:
The same applies for the local moments , expressed as a linear combination of the moments with respect to the original coordinate system:
In both equations, are the Wigner small -rotation coefficients whose computation falls beyond the scope of this work. A detailed explanation and implementation on how to compute them can be consulted at . The term represents a factor that is needed for each moment to compute the rotation. On a normal implementation, the operator will have to compute both terms on the fly, adding a pre-factor to the complexity of this operators. However, that pre-factor can be removed by precomputing and reusing the constants.
And that’s all folks! I hope you find this brief introduction to FMM interesting! All the figures were proudly designed and generated from scratch using LaTeX/Tikz.
 Leslie Greengard and Vladimir Rokhlin. A fast algorithm for particle simulations. Journal of computational physics, 73(2):325–348, 1987.
 Jack Dongarra and Francis Sullivan. Guest editors introduction to the top 10 algorithms. Computing in Science Engineering, 2(1):22–23, Jan 2000.
 Barry A Cipra. The best of the 20th century: editors name top 10 algorithms. SIAM news, 33(4):1–2, 2000.
 Ivo Kabadshow. Periodic boundary conditions and the error-controlled fast multipole method, volume 11. Forschungszentrum Jülich, 2012.
 Christopher A. White and Martin Head-Gordon. Rotating around the quartic angular momentum barrier in fast multipole method calculations. The Journal of Chemical Physics, 105(12):5061–5067, 1996.
 Simon Pfreundschuh. Implementation of a rotation operator for the fast multipole method. Technical report. Forschungszentrum Jülich, 2014.