Performance of Parallel Python Programs on New HPC Architectures

Project reference: 2007
Python is widely used in scientific research for tasks such as data processing, analysis and visualisation. However, it is not yet widely used for large-scale modelling and simulation on high performance computers because of its poor performance: Python is designed primarily for ease of use and flexibility, not for speed. Even so, many techniques can dramatically increase the speed of Python programs, such as parallelisation using MPI, high-performance scientific libraries and fast array processing using NumPy. Although there have been many studies of Python performance on Intel processors, there have been few investigations on other architectures such as AMD EPYC, ARM64 and GPUs. In 2020, EPCC will have access to all three of these architectures via the new UK HPC National Tier-1 Supercomputer ARCHER2, our own Catalyst machine Fulhame and the Tier-2 system Cirrus. A Summer of HPC project in 2019 developed a parallel Python version of an existing C program that performs a Computational Fluid Dynamics (CFD) simulation of fluid flow in a cavity. This project will extend that work to investigate and optimise its performance on a range of novel HPC architectures, and to make use of GPUs.
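To illustrate what fast array processing with NumPy can mean in practice, the sketch below compares an explicit-loop update of a 2D grid with the equivalent sliced NumPy expression. It is not taken from the project's CFD code: the four-point averaging stencil and the function names `jacobi_step_loops` and `jacobi_step_numpy` are assumptions chosen only as a typical example of a grid-based kernel.

```python
import numpy as np


def jacobi_step_loops(psi):
    """One averaging sweep over the interior points using Python loops."""
    new = psi.copy()
    rows, cols = psi.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i, j] = 0.25 * (psi[i - 1, j] + psi[i + 1, j]
                                + psi[i, j - 1] + psi[i, j + 1])
    return new


def jacobi_step_numpy(psi):
    """The same sweep written with array slices, so the loops run in compiled code."""
    new = psi.copy()
    new[1:-1, 1:-1] = 0.25 * (psi[:-2, 1:-1] + psi[2:, 1:-1]
                              + psi[1:-1, :-2] + psi[1:-1, 2:])
    return new


if __name__ == "__main__":
    grid = np.random.default_rng(0).random((64, 64))
    # Both versions should agree to floating-point tolerance.
    print(np.allclose(jacobi_step_loops(grid), jacobi_step_numpy(grid)))
```

Both functions leave the boundary values untouched and update only interior points. The sliced version usually runs much faster on large grids, although the actual gain depends on the grid size and the machine.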

Sample output of existing program showing turbulent flow in a cavity
Project Mentor: Dr David Henty
Project Co-mentor: Dr Oliver Brown
Site Co-ordinator: Juan Herrera
Participants: Alexander Julian Pfleger, Antonios-Kyrillos Chatzimichail
Learning Outcomes:
The student will develop their knowledge of Python programming and learn how to compile and run programs on a range of leading HPC systems. They will also learn how to use GPUs for real scientific calculations.
Student Prerequisites (compulsory):
Ability to program in one of these languages: Python, C, C++ or Fortran. A willingness to learn new languages.
Student Prerequisites (desirable):
Ability to program in Python
Training Materials:
Material from EPCC’s Python for HPC course or the PRACE Python MOOC.
Workplan:
- Task 1: (1 week) – SoHPC training week
- Task 2: (2 weeks) – Understand the functionality of the existing parallel C and Python codes and make an initial port to the new HPC platforms
- Task 3: (3 weeks) – Measure baseline performance on the new HPC platforms and develop a new GPU-enabled version
- Task 4: (2 weeks) – Investigate performance optimisations and write the final report
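One simple way to approach the baseline measurements in the workplan is a small timing helper that runs a kernel several times and keeps the fastest run. The sketch below uses only the standard library; `best_time` and `sum_of_squares` are hypothetical names for illustration, and the placeholder workload would be replaced by the real simulation step.

```python
import time


def best_time(func, *args, repeats=5):
    """Return the fastest wall-clock time in seconds over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


def sum_of_squares(n):
    """Placeholder workload standing in for a real simulation kernel."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    elapsed = best_time(sum_of_squares, 100_000)
    print(f"best of 5 runs: {elapsed:.6f} s")
```

Taking the minimum over repeated runs reduces the effect of noise from other activity on the node. Running the same script unchanged on each platform makes the results easier to compare.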
Final Product Description:
Benchmarking results for Python performance on a range of parallel machines.
Recommendations for how to improve Python performance on AMD EPYC and ARM64 processors.
Development of a GPU-enabled parallel Python application.
Adapting the Project: Increasing the Difficulty:
The project can be made harder by investigating advanced optimisation techniques such as calling from Python into compiled languages such as C, C++ or Fortran.
Adapting the Project: Decreasing the Difficulty:
The project can be made simpler by considering only one of the target platforms, or by considering CPU-only versions and omitting the GPU work.
Resources:
Access to all HPC systems can be given free of charge by EPCC.
Organisation:
EPCC