Performance of Parallel Python Programs on New HPC Architectures

Sample output of existing program showing turbulent flow in a cavity

Project reference: 2007

Python is widely used in scientific research for tasks such as data processing, analysis and visualisation. It is not yet widely used for large-scale modelling and simulation on high performance computers because of its poor performance: Python is primarily designed for ease of use and flexibility, not for speed. There are, however, many techniques that can dramatically increase the speed of Python programs, such as parallelisation using MPI, high-performance scientific libraries and fast array processing using numpy.

Although there have been many studies of Python performance on Intel processors, there have been few investigations on other architectures such as AMD EPYC, ARM64 and GPUs. In 2020, EPCC will have access to all three of these architectures via the new UK HPC National Tier-1 Supercomputer ARCHER2, our own Catalyst machine Fulhame, and the Tier-2 system Cirrus.

A Summer of HPC project in 2019 developed a parallel Python version of an existing C program that performs a Computational Fluid Dynamics (CFD) simulation of fluid flow in a cavity. This project extends that work to investigate and optimise its performance on a range of novel HPC architectures, and to make use of GPUs.
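To illustrate what "fast array processing using numpy" means in practice, here is a minimal sketch that compares an explicit-loop grid update with the same update written using whole-array slices. It assumes, purely for illustration, a Jacobi-style four-point stencil of the kind often used in simple CFD codes; the function names, grid size and boundary values are hypothetical and are not taken from the project's actual code.

```python
import numpy as np


def jacobi_step_loops(psi):
    """One Jacobi sweep with explicit Python loops (slow reference version)."""
    new = psi.copy()
    rows, cols = psi.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Each interior point becomes the average of its four neighbours.
            new[i, j] = 0.25 * (psi[i - 1, j] + psi[i + 1, j]
                                + psi[i, j - 1] + psi[i, j + 1])
    return new


def jacobi_step_numpy(psi):
    """The same sweep written with array slices (no Python-level loops)."""
    new = psi.copy()
    new[1:-1, 1:-1] = 0.25 * (psi[:-2, 1:-1] + psi[2:, 1:-1]
                              + psi[1:-1, :-2] + psi[1:-1, 2:])
    return new


# Small hypothetical grid: zero everywhere except a fixed value on one edge.
grid = np.zeros((6, 6))
grid[0, :] = 1.0

loop_result = jacobi_step_loops(grid)
numpy_result = jacobi_step_numpy(grid)
```

Both functions compute the same values. The sliced version hands the arithmetic to numpy's compiled routines, which is typically where the speed-up on larger grids comes from.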


Project Mentor: Dr David Henty

Project Co-mentor: Dr Oliver Brown

Site Co-ordinator: Juan Herrera

Participants: Alexander Julian Pfleger, Antonios-Kyrillos Chatzimichail

Learning Outcomes:
The student will develop their knowledge of Python programming and learn how to compile and run programs on a range of leading HPC systems. They will also learn how to use GPUs for real scientific calculations.

Student Prerequisites (compulsory):
Ability to program in one of these languages: Python, C, C++ or Fortran. A willingness to learn new languages.

Student Prerequisites (desirable):
Ability to program in Python

Training Materials:
Material from EPCC’s Python for HPC course or the PRACE Python MOOC.


Workplan:

  • Task 1: (1 week) – SoHPC training week
  • Task 2: (2 weeks) – Understand the functionality of the existing parallel C and Python codes and make an initial port to the new HPC platforms
  • Task 3: (3 weeks) – Measure baseline performance on the new HPC platforms and develop a new GPU-enabled version
  • Task 4: (2 weeks) – Investigate performance optimisations and write the final report

Final Product Description:

Benchmarking results for Python performance on a range of parallel machines.
Recommendations for how to improve Python performance on AMD EPYC and ARM64 processors.
Development of a GPU-enabled parallel Python application.

Adapting the Project: Increasing the Difficulty:
The project can be made harder by investigating advanced optimisation techniques such as cross-calling from Python to other compiled languages such as C, C++ or Fortran.

Adapting the Project: Decreasing the Difficulty:
The project can be made simpler by considering only one of the target platforms, or by considering CPU-only versions and omitting the GPU work.

Access to all HPC systems can be given free of charge by EPCC.


