Why we work with persistent memory

I have just noticed a small problem at my workplace: to be productive I need both fresh air and a calm, quiet place. I have two options: keep my door open and let the fresh air in (the air conditioning is in a nearby room), or close it to limit the noise from outside. Both options have downsides: if I close the door, the temperature in my room rises and it becomes uncomfortable; if I leave it open, I tend to get distracted by the noise outside. It’s a lose-lose situation.

I know what you may be thinking: “Why should this interest us?” My answer is: “Well, I don’t know, but I find it interesting that this kind of situation is very similar to many that arise in Computer Science.” I’m talking about the fact that the ideal memory should be fast (like RAM) and keep its contents after power-off (like a disk), but usually we can get only one of the two properties, not both.

The problem is made worse by the fact that data transfer to disk is usually a bottleneck in HPC applications: a single transfer to disk can take more than 100 times longer than the same transfer in RAM. Yet applications must use the disk, since it is the only way to make their data persistent. There seems to be no escape from the problem. But there is a solution.

Persistent memory (also called non-volatile memory or NVRAM) does what its name says: it is memory (similar in performance to RAM) that can hold data even after power-off. It combines the merits of the two technologies and sheds most of their downsides. It can also be configured for the best performance depending on the use case. If you don’t need persistence, you can opt for Memory Mode, in which the RAM is used as a last-level cache for the slower NVRAM, giving a transparent increase in available memory. If persistence matters more to you, you can switch to App Direct Mode, where you can take full control of the features of persistent memory or simply turn it into a filesystem (fsdax). Both approaches come with performance gains: App Direct Mode is faster than working with the disk, and Memory Mode is better than swapping to a dedicated partition. We measured these gains during our tests on the NEXTGenIO cluster, and we will show more concrete numbers in future posts, after further refinement and optimization.
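To give a feeling for what the filesystem (fsdax) flavour of App Direct Mode looks like from the application side, here is a minimal sketch in Python. It is not the code from our project: it only assumes that persistent memory is mounted as a DAX filesystem somewhere (the path `/mnt/pmem` and the `PMEM_DIR` variable are hypothetical) and that the program maps a file from it into memory. The helper names `write_persistent` and `read_persistent` are made up for illustration. On an ordinary disk the same code still runs, just without the speed of persistent memory.

```python
import mmap
import os
import tempfile


def write_persistent(path: str, payload: bytes, size: int = 4096) -> None:
    """Copy payload to the start of a memory-mapped file and flush it."""
    if len(payload) > size:
        raise ValueError("payload larger than mapped region")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)  # the file must be as large as the mapping
        with mmap.mmap(fd, size) as region:
            # Plain memory stores: no write() call for the data itself.
            region[:len(payload)] = payload
            # Ask the OS to make the stores durable before we return.
            region.flush()
    finally:
        os.close(fd)


def read_persistent(path: str, length: int) -> bytes:
    """Read the first `length` bytes back through a read-only mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as region:
            return bytes(region[:length])


if __name__ == "__main__":
    # PMEM_DIR is an assumed variable pointing at an fsdax mount such as
    # /mnt/pmem; without it we fall back to a temporary directory on disk.
    base = os.environ.get("PMEM_DIR") or tempfile.mkdtemp()
    target = os.path.join(base, "region.bin")
    write_persistent(target, b"survives power-off")
    print(read_persistent(target, 18))
```

Applications that want full control of persistent memory, rather than treating it as a filesystem, typically rely on dedicated libraries such as PMDK instead of plain `mmap`; the sketch above only covers the simplest case.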

One of the many nodes of the NEXTGenIO cluster: dual-CPU Intel® Xeon® SP nodes of up to 56 cores, each with 192GB of conventional DRAM and 3TB of NVRAM.

Persistent memory, while a very promising technology, is not widely used today: development is still in progress to reduce costs and increase performance. For now, only a few applications can fully exploit the functionality it offers; the number is expected to grow, but the current lack of support can hold back adoption. Putting persistent memory support inside frameworks can speed up the integration process: this is the philosophy behind our project. So far, we have changed Charm++ checkpointing to allow persistent memory to be used as a filesystem, and we will dig deeper into the code to achieve even more performance in the coming weeks.

Those are the basics of persistent memory. If you are interested in this topic and want to know more about how to fully exploit its performance, take a look at this online course held at EPCC a few months ago. And if you have a solution to my door problem, please tell me, because I still haven’t found one!
