To Py or not to Py? – Python project update


Hello everybody! I hope you are all well.
As you can see in the featured image I am really excited about receiving the programme’s t-shirt!
So, in my last blog post I introduced myself and asked you to answer a question in the comments. I am pleased that you took up the challenge and answered. In today’s post I would like to describe the project I am working on and my progress so far.

Project’s motivation

Python is becoming increasingly popular, mainly because it is user-friendly and saves a lot of development and debugging time. However, when it comes to performance, lower-level programming languages (those that are closer to the machine than to the human) produce programs that run faster. Simulations, visualisations and other computationally heavy programs that run on supercomputers expose Python’s poor performance. So the question is: can Python be optimised to run fast and benefit from HPC architectures?
Before answering that, we have to understand some basic features of the Python program that we will study.

The program

In this example, the cavity is a square box with an inlet on one side and an outlet on another.

The program performs a Computational Fluid Dynamics (CFD) simulation of fluid flow in a cavity.

A fluid dynamics problem is a continuous system that can be described by partial differential equations, but for a computer to simulate it, the calculations have to be carried out on a grid (discretisation). The solution can then be approximated with the finite difference method, in which the value at each grid point is updated using the values of its neighbouring points.

The blue point of the grid is updated using the cyan top, bottom, left and right points.
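
To make the neighbour update concrete, here is a minimal sketch in NumPy. It assumes the simplest possible rule, where each interior point becomes the average of its four neighbours (a Jacobi step for a Laplace-type equation). The function name `jacobi_step` and the toy 5x5 grid are made up for illustration; the actual equations in the CFD program may contain additional terms.

```python
import numpy as np

def jacobi_step(psi):
    """Return a new grid in which every interior point is the mean of its
    four neighbours. Boundary points are left unchanged."""
    new = psi.copy()
    new[1:-1, 1:-1] = 0.25 * (
        psi[:-2, 1:-1]    # neighbour above
        + psi[2:, 1:-1]   # neighbour below
        + psi[1:-1, :-2]  # neighbour to the left
        + psi[1:-1, 2:]   # neighbour to the right
    )
    return new

# Toy 5x5 grid: top boundary held at 1.0, everything else starts at 0.0
grid = np.zeros((5, 5))
grid[0, :] = 1.0
grid = jacobi_step(grid)
```

After one step, the interior points directly below the top boundary pick up a quarter of its value, while points further away are still zero. Repeating the step many times lets the boundary values spread through the whole grid.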

The program can be parameterized by specifying the variables below:

  • Scale Factor – affects the dimensions of the box cavity and consequently the size of the array(s) in which the grid is stored.
  • Number of Iterations – sets the number of steps in the algorithm; the larger it is, the more accurate the result will be.
  • Reynolds number (Re) – reflects the fluid’s viscosity (for a given flow speed and cavity size, a higher Re corresponds to a lower viscosity), which affects the presence of vortices (whirlpools) in the flow.

The simulation result is visualized by arrows and colors drawn in an image representing the grid. The arrows demonstrate the direction of the fluid at each point, while the different colors indicate the fluid’s speed, with blue being low speed and red being high speed.

Output images after simulation with Reynolds number = 0 (left) and Reynolds number = 2 (right). For a non-zero Reynolds number, the arrows reveal the existence of whirlpools.

Optimisation possibilities

There are several techniques that can be applied to speed up Python codes. The goal of this project is to investigate optimisations for Python programs that run not only on CPUs but also on GPUs.

Progress in the first 3 weeks

Some of my early tasks were to study the algorithm, understand the existing Python and C codes, get access to HPC systems and submit my first jobs to the supercomputers.

Currently, I am working on optimising the Python code. The metric that interests us is the iteration time, which is the total time for N iterations divided by N. I have tried several Python modules that accelerate the calculations and found that the numexpr module works best for our case. The baseline (unoptimised) Python code uses the NumPy module for fast array calculations. However, only the numexpr version of the Python code can compete with C, since numexpr creates fewer temporary arrays and uses multithreading internally.
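
As a rough illustration of the iteration time metric, the sketch below times N calls of a step function and divides the total by N. The names `iteration_time` and `dummy_step` are hypothetical, and the dummy step only stands in for one solver iteration; this is not the timing code used in the project.

```python
import time

def iteration_time(step, state, n_iterations):
    """Run `step` n_iterations times and return
    (average seconds per iteration, final state)."""
    start = time.perf_counter()
    for _ in range(n_iterations):
        state = step(state)
    total = time.perf_counter() - start
    return total / n_iterations, state

# Hypothetical stand-in for one solver iteration: halve every value
def dummy_step(values):
    return [0.5 * v for v in values]

per_iter, final = iteration_time(dummy_step, [1.0, 2.0], 3)
print(f"{per_iter:.3e} s per iteration")
```

Averaging over many iterations smooths out timer resolution and one-off start-up costs, which matters when a single iteration is very short.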

Graph representing Iteration Time over the Number of Threads for the optimised numexpr version, compared to the baseline versions. Python numexpr code performs better than the serial C code when more than 2 threads are used.
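
To give an idea of what a numexpr version of such an update could look like, here is a hedged sketch that compares a plain NumPy neighbour average with the same expression passed to `numexpr.evaluate`. It is not the project’s code: the function names, the averaging rule and the optional `n_threads` argument are assumptions for illustration, and the sketch falls back to NumPy if numexpr is not installed.

```python
import numpy as np

try:
    import numexpr as ne  # optional third-party dependency
except ImportError:
    ne = None

def step_numpy(psi):
    """Neighbour average with NumPy; each '+' creates a temporary array."""
    new = psi.copy()
    new[1:-1, 1:-1] = 0.25 * (
        psi[:-2, 1:-1] + psi[2:, 1:-1] + psi[1:-1, :-2] + psi[1:-1, 2:]
    )
    return new

def step_numexpr(psi, n_threads=None):
    """Same update, but the whole expression is evaluated by numexpr,
    which works through it in chunks and can use several threads."""
    if ne is None:
        return step_numpy(psi)  # fallback when numexpr is unavailable
    if n_threads is not None:
        ne.set_num_threads(n_threads)
    operands = {
        "up": psi[:-2, 1:-1],
        "down": psi[2:, 1:-1],
        "left": psi[1:-1, :-2],
        "right": psi[1:-1, 2:],
    }
    new = psi.copy()
    new[1:-1, 1:-1] = ne.evaluate(
        "0.25 * (up + down + left + right)", local_dict=operands
    )
    return new

test_grid = np.arange(36, dtype=np.float64).reshape(6, 6)
same = np.allclose(step_numpy(test_grid), step_numexpr(test_grid))
```

Both functions should give the same result; any difference would be in how much temporary memory is allocated and how many threads do the work, which only becomes visible on much larger grids than this toy example.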

Next goals and conclusion

My next goal is to produce a performance graph for the MPI versions, which I will describe next time, and then move on to developing an equivalent Python program for GPUs.
Before concluding, my question for you this time is the question in the title:

“ To Py or not to Py? ”
By that I mean: what is your experience with Python? Have you ever used it? If so, is it your preferred language? Have you ever worried about its performance?
Please feel free to share your thoughts in the comments section.

That’s all for this post. I really hope you found some interesting points in it, and I will be happy to see you in one of my future posts.

2 comments on “To Py or not to Py? – Python project update”
  1. Fokion says:

    Hello Antonis!! I see your point of view about using Python to optimise heavy programs on supercomputers, and I was really impressed by your description in the blog above, even though I have not used Python professionally yet. Nevertheless, I think it is a big challenge, and you have made good progress since your last post. Take care of yourself and keep going like that!!!

    • Antonios-Kyrillos Chatzimichail says:

      Thank you Fokion! I’m glad that you found my approach interesting.
      I’ve recently learned Python; it is quite easy to learn, especially if you are familiar with some other programming language. You could give it a try.
      Take care too.
