Killing the supercomputing Hydra

Lernaean Hydra [3]
I guess you have all heard the story of St. George and the Dragon [1]. According to the legend, the Dragon caused many problems for the ancient city of Lasia, somewhere in the Near East. The Dragon was really bad, eating maidens every day, and one day he wanted the King’s daughter for dinner. But St. George was a very brave warrior, so he was not afraid. He slew the Dragon, saved the King’s daughter and the city, and became a legend.
So, I guess, the reader must be wondering what this legend has to do with supercomputing. Well, when I started to work on the CP2K application, I thought I was fighting against a big, mighty Dragon. It turned out that I had problems with a more powerful mythical creature: the Hydra [2]. As you remember, the Hydra had many heads, and so did the set of supercomputers on which I had to build the CP2K application.
The application itself
CP2K [4] is a complex scientific application for atomistic and molecular simulations of solid state, liquid, molecular, and biological systems. The application is written in Fortran 95 and supports both serial and parallel execution. Parallel builds include MPI, OpenMP, hybrid (MPI + OpenMP) and GPU implementations. So the application is really heterogeneous, and so is the hardware on which it can be executed. You can look at CP2K as my mighty enemy, the Dragon.
Compilers and libraries
To build the application, different compilers and libraries are needed, depending on the platform. First, you need a fresh GCC (GNU Compiler Collection), version 4.6 or higher, or gfortran will complain for no apparent reason (it took me some time to figure that out). Fortran compilers are far more sensitive creatures than C compilers, as the C language is widely used and well supported. Fortran is the language of choice for the scientific community, so its compilers are more exposed to changes in the standard (F77, F90, F95…).
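A quick way to catch the compiler problem early is to check the version before starting a long build. The snippet below is only a minimal sketch in Python, assuming gfortran is on the PATH and that `gfortran -dumpversion` prints a plain dotted version such as 4.6.3. The helper names are made up for illustration.

```python
import subprocess

MINIMUM_GCC = (4, 6)  # the minimum version mentioned above


def parse_version(text):
    """Turn a dotted version string such as '4.6.3' into (4, 6, 3)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_new_enough(version_text, minimum=MINIMUM_GCC):
    """True if the version is at least `minimum` (tuples compare left to right)."""
    return parse_version(version_text) >= minimum


def installed_gfortran_version():
    """Ask gfortran for its version; raises if gfortran is missing or fails."""
    result = subprocess.run(
        ["gfortran", "-dumpversion"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a real machine you would call `is_new_enough(installed_gfortran_version())` on the frontend and, ideally, inside a small test job on the backends as well.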
Then, you need BLAS, LAPACK, FFTW3, and some project-specific libraries (they did not cause any trouble). Most of those you need to build from scratch with the classic chain of configure, make, and make install commands. But it is not that easy if you are not a superuser on the machine, as it needs some tweaking (configure --prefix, for example, to install into a directory you can write to). Those libraries are barely enough for serial execution. For parallel builds, you need ScaLAPACK (which, in turn, needs BLACS and LAPACK) to run the MPI version, while every build with OpenMP needs libraries like FFTW3 recompiled with threading support.
Here comes the good part. Some of those libraries (e.g. BLAS, LAPACK) are finely tuned for specific processor architectures in vendor-specific libraries, like MKL (Intel) and ACML (AMD), which makes the build process even more complex. For every architecture, you have to pass the compiler the right set of flags or it won’t work at all. In addition, some cluster systems have preinstalled library packages, while on others you are left on your own, and the situation can differ even between nodes of the same cluster.
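To keep track of what has to be built before what, it helps to write the dependencies down and let a topological sort produce the order. The following is a rough Python sketch, not my actual build script. Only the ScaLAPACK entry is taken directly from the text. The LAPACK on BLAS edge is the usual dependency, the "CP2K (MPI build)" node and the `~/local` prefix are made up for illustration, and not every library listed ships a configure script.

```python
import os
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Library -> libraries that must already be built.
DEPENDS_ON = {
    "BLAS": set(),
    "LAPACK": {"BLAS"},                  # usual dependency, added for illustration
    "FFTW3": set(),
    "BLACS": set(),
    "ScaLAPACK": {"BLACS", "LAPACK"},    # as stated in the text
    "CP2K (MPI build)": {"BLAS", "LAPACK", "FFTW3", "ScaLAPACK"},
}


def build_order(depends_on):
    """Return the libraries in an order where dependencies come first."""
    return list(TopologicalSorter(depends_on).static_order())


def user_install_commands(prefix):
    """Classic three-step chain, installing under `prefix` so that no
    superuser rights are needed."""
    return [
        ["./configure", f"--prefix={prefix}"],
        ["make"],
        ["make", "install"],
    ]


if __name__ == "__main__":
    prefix = os.path.expanduser("~/local")  # hypothetical install location
    print("Build order:", ", ".join(build_order(DEPENDS_ON)))
    for command in user_install_commands(prefix):
        print(" ".join(command))
```

The sketch only prints the commands. Whether a given package really follows the configure, make, make install pattern has to be checked in its own documentation.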
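The choice between the vendor libraries can at least be automated. Here is a small Python sketch, assuming a Linux x86 machine where /proc/cpuinfo reports a vendor_id of GenuineIntel or AuthenticAMD. The function names and the fallback label are my own, and the sketch deliberately stops at naming the library: the actual compiler and linker flags differ per system and are not shown.

```python
# CPU vendor string -> tuned math library named in the text.
VENDOR_MATH_LIBRARY = {
    "GenuineIntel": "MKL",
    "AuthenticAMD": "ACML",
}
FALLBACK = "reference BLAS/LAPACK"  # label chosen for this sketch


def cpu_vendor(cpuinfo_text):
    """Return the first vendor_id value in /proc/cpuinfo-style text, or None."""
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "vendor_id":
            return value.strip()
    return None


def pick_math_library(cpuinfo_text):
    """Pick the vendor library for this CPU, or the reference one if unknown."""
    return VENDOR_MATH_LIBRARY.get(cpu_vendor(cpuinfo_text), FALLBACK)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the CPU description; this file exists on Linux only."""
    with open(path) as handle:
        return handle.read()
```

Because nodes of the same cluster can differ, `pick_math_library(read_cpuinfo())` would need to run on the node where the job executes, not only on the login node.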
Supercomputers
So, we come to my Hydra. I had to build and install the app on several different systems (heads!), including HECToR (the UK national supercomputing service, 90k AMD processors), Indy (the EPCC industry machine, a medium cluster with 1536 AMD cores), and my favourite one (irony!), Hydra (the EPCC development machine, a small cluster with ~200 Intel cores). Supercomputing clusters usually consist of a frontend (login node) and several backends (compute nodes with different capabilities) where you execute your jobs. In order to execute your jobs seamlessly, those nodes should have a common (and harmonized) software stack, which is not always the case.
First I tried to play with the smallest one, Hydra. It was a bad decision, as I had many problems. Hydra is a really heterogeneous machine, and its software stack is not harmonized between the frontend and the backends, meaning that something you can build and execute on the frontend might not work on the backends (or might need a heavy workaround). I gave up for a while and moved to Indy, which is a reasonably well supported machine in terms of software stack, although with a somewhat outdated, vendor-specific PlatformMPI. It took me some time to figure out all the compiler flags, since Hydra is an Intel machine and Indy is an AMD machine, so I had to switch from the Intel MKL to the AMD ACML core math library. ACML in particular caused problems: ScaLAPACK did not build with the ACML implementation of LAPACK, only with the reference one (the slowest one). The HECToR supercomputer is the best supported in terms of the software stack. I didn’t have many problems with it, although it takes some time to get familiar with a supercomputer of that scale.
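What "harmonized" means can be made concrete with a small comparison. The Python sketch below assumes you have somehow collected a package-to-version mapping for each node (how you collect it depends on the machine). The node names and versions in the example data are invented purely for illustration.

```python
def stack_differences(frontend, backends):
    """For each backend, report packages that are missing on one side or
    whose version differs from the frontend.

    Returns {node: {package: (frontend_version, backend_version)}};
    a version of None means the package is absent on that side.
    """
    report = {}
    for node, stack in backends.items():
        differences = {}
        for package in sorted(set(frontend) | set(stack)):
            front_version = frontend.get(package)
            back_version = stack.get(package)
            if front_version != back_version:
                differences[package] = (front_version, back_version)
        if differences:
            report[node] = differences
    return report


# Invented example data, not taken from any real cluster.
EXAMPLE_FRONTEND = {"gcc": "4.6.3", "fftw": "3.3"}
EXAMPLE_BACKENDS = {
    "node01": {"gcc": "4.6.3", "fftw": "3.3"},  # matches the frontend
    "node02": {"gcc": "4.4.7"},                 # older gcc, no fftw
}

if __name__ == "__main__":
    report = stack_differences(EXAMPLE_FRONTEND, EXAMPLE_BACKENDS)
    for node, differences in report.items():
        for package, (front, back) in differences.items():
            print(f"{node}: {package} frontend={front} backend={back}")
```

An empty report is what you hope for: anything listed is a candidate explanation for a binary that runs on the login node but fails in a job.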
Finally, with the experience gained, tons of patience and some heroic effort, I went back to Hydra. The builds were done. The beast was gone for good.
——————————————————————————-
[1] http://en.wikipedia.org/wiki/St._George_and_the_Dragon
[2] http://en.wikipedia.org/wiki/Lernaean_Hydra
[3] The picture is borrowed from http://en.wikipedia.org/wiki/File:Gustave_Moreau_003.jpg in accordance with its copyright terms
[4] http://www.cp2k.org/