Over the last 5 years, a big chunk of the computations previously done on CPUs has moved to GPUs, because graphics cards are really good at working through enormous amounts of data. So why is x86 still trying to improve itself by adding instruction sets like AVX-512, when that kind of computation is now done on GPUs? Isn't there another kind of CPU architecture better suited to general-purpose server CPUs? This series of blog posts will give some hints about that, while explaining some interesting projects related to ARM, but…
First things first
I am Jerónimo Sánchez, a 20-year-old, humble Computer Science student at the University of Almería (UAL), Spain. I am about to start my 4th and last year studying CS here.
I am specializing in IT systems (there are not many specializations to choose from), although my main interests are in HPC and in working close to the hardware.
Getting personal, I am into low-impact sports, such as swimming and cycling, because of a knee injury. I also really enjoy spending time with my friends playing board and video games. I especially like playing Dungeons and Dragons, but our sessions are quite scarce, as the game is really time-consuming.
As a fellow Computer Scientist, I am also into solving problems. That means that I spend part of my day thinking about and coding solutions to said problems. I mainly use a Raspberry Pi to automate some tasks in my room but, since I am short of power outlets, I treat this problem-solving exercise as a puzzle.
Regarding Summer of HPC, I have found it to be the perfect way to spend this summer: the summer internship I had lined up was cancelled when the pandemic started, so the timing was excellent.
Here, in Summer of HPC, I will be working on porting and benchmarking a fully ARM-based cluster (project #2006). As part of this enterprise, I will be working on two subprojects:
- The first one will give some insights into the implementation of the cluster's file-system drivers, based on benchmarks run with the IOR tool.
- The second and last subproject will try to “cheat” the MPI library so that it works correctly with SMT-X (Simultaneous Multithreading) levels where X > 1. In doing so, project #2006 can yield some valuable and helpful insights about ARM in the server space.
This cabinet is part of the Fulhame cluster, owned by EPCC (Edinburgh Parallel Computing Centre). Below are its components:
This HPE Apollo 70-based system consists of 64 compute nodes, each with two 32-core Cavium ThunderX2 processors (i.e. 4096 cores in total), 128GB of memory composed of 16 DDR4 DIMMs, and Mellanox InfiniBand interconnects.
Retrieved from the Fulhame website
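The core count in the quote is easy to check by hand, and the same arithmetic shows why SMT levels matter for the MPI subproject. Below is a minimal sketch in Python; the function names are made up for illustration, and the SMT levels 1, 2 and 4 used in the loop are an assumption about what the ThunderX2 can be configured to, not something stated on the Fulhame website.

```python
# Figures taken from the Fulhame description quoted above.
NODES = 64
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 32


def total_cores(nodes=NODES, sockets=SOCKETS_PER_NODE, cores=CORES_PER_SOCKET):
    """Physical cores in the whole cluster."""
    return nodes * sockets * cores


def hardware_threads(smt_level, cores=None):
    """Hardware threads exposed to the OS for a given SMT-X level.

    smt_level is the X in SMT-X; X = 1 means one thread per core.
    """
    if smt_level < 1:
        raise ValueError("SMT level must be at least 1")
    if cores is None:
        cores = total_cores()
    return cores * smt_level


if __name__ == "__main__":
    print("physical cores:", total_cores())
    # Assumed SMT levels, for illustration only.
    for level in (1, 2, 4):
        print(f"SMT-{level}: {hardware_threads(level)} hardware threads")
```

With X > 1 the operating system sees more logical CPUs than there are physical cores, which is exactly the situation an MPI launcher has to place its ranks on.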
So, why is there an ARMageddon?
Good question, dear reader. ARM is a “novel” computer architecture, mostly known by the general public for powering their smartphones. Until now, ARM was dismissed as a high-performance architecture, mainly because it was mostly designed for devices with power constraints (smartphones, embedded devices, …), but now it powers the number one supercomputer in the world: Fugaku.
To sum up:
ARMageddon is an epic battle for dominance of the server space among three main participants: ARM, x86 and PowerPC. It is, at the same time, an architectural design battle, RISC vs CISC, but that is for another post.
As far as we are concerned, computations are being moved to GPUs and server/supercomputer CPUs are taking on a controller role in the data pipeline. Processors can therefore become low-power chips, as they are mostly sending and controlling data, and the remaining calculations that are still done on CPUs don’t need to be the quickest, just good enough.
For the reasons stated above, ARM is a great candidate for CPUs in the server space, and in the consumer space too. ARM also offers some other advantages:
- Custom chips can be designed. You are not tied to a Xeon CPU with features you do not need.
- You are not tied to a single supplier either; your custom design can be manufactured by different companies.
- The Intel Tax is skipped, although consumer Nvidia GPUs cannot be used in servers, so there is always a tax.
- The performance/price ratio is better.
- ARM chips are more energy efficient, which can cut the electricity bill.
- Related to the previous point: the chips generate less heat, so the infrastructure needed to dissipate that heat is cheaper and easier to maintain.
For now, thanks for reading, and it is a pleasure to be part of the ARMageddon. Until the next post, dear reader.