Regression tests for molecular dynamics simulation software
As promised, here is an update on my Summer of HPC adventure. For the past two weeks, I have been working on project 2017 – Benchmarking and performance analysis of HPC applications on modern architectures using automating frameworks “at” SURFsara.
From day one, I’ve been able to log in and run jobs on Cartesius, the Dutch supercomputer, and I must admit that the experience has been even better than expected. Having access to such processing power from my bedroom makes me feel like a child who has just received the toy he has been waiting years for.
At first, I got familiar with the system and with ReFrame, a framework for automating regression tests on HPC systems. Tools of this kind are becoming increasingly important as the evolution of HPC produces a growing variety of competing hardware architectures. Thanks to these frameworks, HPC maintainers can detect in a simple, reliable and automatic way whether HPC software and scientific programs behave as expected across a range of hardware. It is also possible to configure several toolchains and builds and check how each choice affects the final execution. Thus, HPC specialists can not only detect failures in the systems but also benchmark the performance of the programs and find potential bottlenecks.
Once I had learnt the basics of ReFrame, I was assigned an exciting task: creating, from scratch, regression tests for a scientific application widely used on HPC systems, NAMD. According to the official page, NAMD is “a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems”. Moreover, this software is involved in massive simulations of the SARS-CoV-2 coronavirus envelope (more info here).
Therefore, during these first weeks, I have been designing different regression tests for that application, using public benchmarks to measure the performance of Cartesius. I was also able to test different architectures and even computing accelerators such as NVIDIA GPUs. Once I had run all the tests, I obtained interesting insights from the results. With this information, I wrote a series of guidelines for future Cartesius users, so that they can run their NAMD simulations in the best possible way.
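To give an idea of what a performance check of this kind involves, here is a minimal sketch in plain Python. It is not ReFrame code and not the tests run on Cartesius. It assumes that NAMD prints lines of the form "Info: Benchmark time: ... s/step ... days/ns", and the log excerpt, the reference value and the 10% tolerance are all invented for illustration.

```python
import re

# Illustrative log excerpt: the numbers are made up, not measured on Cartesius.
SAMPLE_LOG = """\
Info: Benchmark time: 48 CPUs 0.0250 s/step 0.2894 days/ns 512.0 MB memory
Info: Benchmark time: 48 CPUs 0.0240 s/step 0.2778 days/ns 512.0 MB memory
"""

# Assumed shape of a NAMD benchmark line; adjust if your log differs.
BENCHMARK_LINE = re.compile(
    r"^Info:\s+Benchmark time:\s+\d+\s+CPUs\s+"
    r"(?P<s_per_step>[\d.eE+-]+)\s+s/step\s+"
    r"(?P<days_per_ns>[\d.eE+-]+)\s+days/ns",
    re.MULTILINE,
)


def extract_days_per_ns(log_text):
    """Return every days/ns figure reported in the log, in order."""
    return [float(m.group("days_per_ns")) for m in BENCHMARK_LINE.finditer(log_text)]


def check_performance(log_text, reference_days_per_ns, tolerance=0.10):
    """Compare the mean days/ns against a reference (lower is better).

    Returns (mean, passed). The check passes when the mean is at most
    `tolerance` (as a fraction) slower than the reference.
    """
    values = extract_days_per_ns(log_text)
    if not values:
        # No benchmark lines at all: treat it as a failed run.
        raise ValueError("no benchmark lines found; the run may have failed")
    mean = sum(values) / len(values)
    passed = mean <= reference_days_per_ns * (1.0 + tolerance)
    return mean, passed


if __name__ == "__main__":
    mean, passed = check_performance(SAMPLE_LOG, reference_days_per_ns=0.28)
    print(f"mean = {mean:.4f} days/ns, within tolerance: {passed}")
```

A framework like ReFrame wraps this idea (run the program, extract a figure from its output, compare it with a reference for the system at hand) so that the same check can be repeated automatically across machines and builds.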
As we have already met the objectives proposed for the NAMD tests, this week I have started a similar study for Alya, which is part of the CompBioMed Center of Excellence. I am really excited about this task because I will also have to perform detailed profiling to detect possible bottlenecks in the Cartesius system!
In the next post, I will inform you of my progress in this new challenge. Stay tuned, so you don’t miss anything.