Benchmarking and performance analysis of HPC applications on modern architectures using automating frameworks
Project reference: 2017
In the race for exascale, supercomputer architectures evolve fast, and the variety of competing hardware solutions has made benchmarking an increasingly important and difficult task for HPC specialists. Supercomputing centres are therefore increasingly interested in easy-to-use, portable frameworks that automate the labour-intensive tasks of compiling, testing and benchmarking scientific applications.
To support the Dutch research community with well-tested and optimised applications on our high-performance computing infrastructure, we at SURFsara use automated workflows to deploy the full software stack on our HPC systems.
These pipelines make use of:
- EasyBuild and Jenkins for software building, installation and continuous integration.
- XALT for tracking and monitoring of software and resource usage.
- ReFrame for regression testing and performance analysis.
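As an illustration of what a performance regression check involves, the sketch below compares a measured runtime against a stored reference value with relative tolerances. It uses only the Python standard library and is not the API of any of the tools listed above; the reference runtimes, the tolerances and the function names are hypothetical.

```python
# Minimal sketch of a performance regression check.
# All reference values and tolerances below are hypothetical.

def within_reference(measured, reference, lower, upper):
    """Return True if measured lies in [reference*(1+lower), reference*(1+upper)].

    lower and upper are relative tolerances, e.g. -0.05 and 0.10 accept
    results from 5% below to 10% above the reference value.
    """
    low = reference * (1 + lower)
    high = reference * (1 + upper)
    return low <= measured <= high

# Hypothetical reference runtimes in seconds for one test case, per system:
# (reference value, lower tolerance, upper tolerance)
REFERENCES = {
    "cartesius": (120.0, -0.05, 0.10),
    "lisa": (150.0, -0.05, 0.10),
}

def check_run(system, measured_seconds):
    """Check a measured runtime against the reference stored for a system."""
    reference, lower, upper = REFERENCES[system]
    return within_reference(measured_seconds, reference, lower, upper)

if __name__ == "__main__":
    for system, measured in [("cartesius", 125.0), ("cartesius", 140.0)]:
        status = "OK" if check_run(system, measured) else "REGRESSION"
        print(f"{system}: {measured:.1f} s -> {status}")
```

A lower tolerance is included as well as an upper one because a run that is much faster than expected often indicates a failed or incomplete computation rather than a real improvement.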
This project aims to improve the services that HPC centres offer to European computational researchers, by supporting them in running their simulations efficiently on modern computing architectures and by helping them make well-motivated choices as HPC systems and hardware evolve.
Using the information gathered with XALT on SURFsara systems, we will identify relevant HPC codes and their main usage (underlying libraries, execution patterns, etc.). This will allow us to select a meaningful set of applications, which we will then integrate in the automated regression and performance testing framework (ReFrame) to produce detailed benchmarks and performance profiles on different HPC systems and architectures. The outcome of this work will be essential to better understand the performance of the most relevant HPC software using state-of-the-art performance measurement frameworks, and to write recommendations for efficient deployment and usage of the codes on the HPC systems.
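One possible way to turn usage records into a shortlist of relevant applications is to rank executables by consumed core-hours. The sketch below assumes the records have already been exported into a list of dictionaries; the field names (`exec_name`, `num_cores`, `run_time_s`) and the sample data are hypothetical and do not reflect the actual XALT database schema.

```python
from collections import defaultdict

def rank_by_core_hours(records, top=3):
    """Sum core-hours per executable and return the `top` largest consumers.

    Each record is a dict with hypothetical keys:
    exec_name (str), num_cores (int), run_time_s (wall time in seconds).
    """
    totals = defaultdict(float)
    for rec in records:
        totals[rec["exec_name"]] += rec["num_cores"] * rec["run_time_s"] / 3600.0
    # Sort by accumulated core-hours, largest first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]

# Made-up sample records, for illustration only.
records = [
    {"exec_name": "app_a", "num_cores": 48, "run_time_s": 7200},
    {"exec_name": "app_b", "num_cores": 24, "run_time_s": 3600},
    {"exec_name": "app_a", "num_cores": 96, "run_time_s": 1800},
    {"exec_name": "app_c", "num_cores": 1, "run_time_s": 600},
]

if __name__ == "__main__":
    for name, core_hours in rank_by_core_hours(records):
        print(f"{name}: {core_hours:.1f} core-hours")
```

Core-hours are only one possible criterion. The number of distinct users or the libraries an application links against could be aggregated in the same way.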
For this work, the student will have access to SURFsara’s HPC systems (Cartesius supercomputer and Lisa cluster), where an instance of XALT is already deployed. In addition, the student will deploy and benchmark the selected applications on different systems to extend the workflows and compare performance across systems. Depending on availability, these may include AMD and ARM test systems, as well as systems of other European HPC sites with which SURFsara has established collaborations. SURFsara is involved in several European projects where benchmarking, co-design, and performance tuning play an important role. Depending on the interests of the intern, the selected applications, and the needs of the stakeholders within the different initiatives, the work of the intern can be linked to one or more of the following projects:
- CompBioMed – Support the Computational Biomedicine community and its diverse set of applications, users and usage scenarios in its High Performance Computing (HPC), High Throughput Computing (HTC), and High-Performance Data Analytics (HPDA) needs.
- EPI – designing low-power European processors for extreme scale computing, high-performance Big-Data and emerging applications. SURFsara is involved in the co-design task.
- PRACE Preparatory Access project 5047 with applied research institute Deltares – enabling faster computations on supercomputers of a state-of-the-art hydrodynamics modelling suite.
The intern will be fully integrated in the Supercomputing team at SURFsara, and the produced results and outcomes will be used directly to improve the supercomputer’s ecosystem and our users’ experience. If the intern shows particular interest and skills, part of his/her work could be devoted to more in-depth profiling and tuning in the context of one of the above-mentioned projects.
Project Mentor: Maxime Mogé
Project Co-mentor: Sagar Dolas
Site Co-ordinator: Carlos Teijeiro Barjas
In addition to discovering what working at a supercomputing centre looks like, the intern will learn about the main characteristics of current and emerging HPC architectures and about the use of modern automation tools for software building, testing, continuous integration and benchmarking. He/she will get hands-on experience with porting, profiling and tuning scientific software, and the outcome of his/her work will directly benefit SURFsara’s users.
Student Prerequisites (compulsory):
- Basic Unix commands.
- Knowledge of Linux as a development environment.
Student Prerequisites (desirable):
- Basic knowledge of Python.
- Basic experience with software compilation.
- Experience with regression testing and benchmarking.
- Knowledge of computer and HPC systems architecture.
Main training material:
- PRACE Supercomputing MOOC https://www.futurelearn.com/courses/supercomputing
- ReFrame slides [pdf] @ HPC System Testing BoF, SC'19.
Additional training material (technical):
Workplan:
- week 1: SoHPC training week.
- week 2: Get familiar with the hardware available at SURFsara and with the regression and testing framework.
- week 3: Select applications and test cases, analyse characteristics of the different available architectures.
- week 4-5: Integrate application(s) in the framework.
- week 6-7: Benchmark and analyse performances.
- week 8: Write recommendations for efficient usage of the application(s) and final report.
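The benchmarking and analysis step (weeks 6-7) could, for example, include a strong-scaling analysis of each application. The sketch below computes speedup and parallel efficiency relative to the smallest run; the core counts and wall times are made-up illustration values, not measurements, and the function name is hypothetical.

```python
def scaling_table(timings):
    """Compute speedup and parallel efficiency from strong-scaling timings.

    timings maps a core count to the measured wall time in seconds.
    The run with the fewest cores is used as the baseline.
    Returns a list of (cores, speedup, efficiency) tuples sorted by cores.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        # Ideal speedup equals the ratio of core counts.
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows

# Hypothetical wall times in seconds for the same test case.
timings = {24: 400.0, 48: 210.0, 96: 125.0}

if __name__ == "__main__":
    print("cores  speedup  efficiency")
    for cores, speedup, efficiency in scaling_table(timings):
        print(f"{cores:5d}  {speedup:7.2f}  {efficiency:10.2f}")
```

Repeating the same table on each architecture gives a simple basis for comparing systems and for recommending sensible job sizes to users.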
Final Product Description:
The project will result in the integration of the selected application in the automated regression and performance testing framework, and will produce detailed benchmarks and performance profiles on different HPC systems and architectures. Through the project we will develop a better understanding of the performance of the selected application and write recommendations for efficient deployment and usage on HPC systems.
If the selected application is part of CompBioMed and if the project is successful, the results could be presented at events organised by CompBioMed. If the selected application is Deltares’ D-Flow FM software, the results could be presented at a project meeting of the PRACE Preparatory Access project.
Adapting the Project: Increasing the Difficulty:
- Do more in-depth performance analysis of the selected applications.
- Port and benchmark the applications on different architectures (depending on availability).
- Write guidelines for efficient porting and usage on HPC systems.
- Select more applications.
Adapting the Project: Decreasing the Difficulty:
The goal of the project could be restricted to integrating a single application in the automated testing workflow.
HPC systems and test systems: SURFsara will grant access to its HPC systems Lisa and Cartesius and to its test systems (AMD processors, ARM processors, etc.).
Other European HPC systems: if relevant, access will be granted via the above-mentioned projects (CompBioMed, EPI, PRACE 6IP WP7 PA5047).
Software: the tools and frameworks used in this project are open source. For target applications that are not open source, SURFsara’s licenses will be used. Access to applications with special access rules (e.g. Deltares’ software D-Flow FM if it is chosen as a target application) will be arranged through the relevant projects (PRACE 6IP WP7 PA5047 in this case).