Part 2: Step by Step, establishing a baseline
Welcome back! Pretty cool of you to read part two. If you’re seeing this as the first post, I also encourage you to view the previous blog entry, as I’ll attempt to make a post every week 🙂
This past week I got to meet my mentors, and the general outline of the project was explained. We were handed 8 articles totalling 933 pages (I think I’ll skip reading a book over the summer), with more material incoming along the way. Further, we have been registered in their systems with a provided email address and have applied for access to the Hartree supercomputer resources over SSH. Tiziano and I also met up over Zoom to discuss the project and our approach to it, as well as our past experience with coding and the material.
So... about the project: what on earth are we even doing?
The key to understanding any project is first and foremost to understand its headline, so bear with me. The project title is “Scaling HMC on large multi-CPU and/or multi-GPGPUs architectures”, and it surely is a mouthful, so let’s start by breaking it down. Scaling in this sense just means that we can make a system larger or smaller and it still functions. A system could, for example, be analogous to our bread baking metaphor from the previous post: doubling the recipe should preferably still yield the same result, as we want the final product to exhibit the same behaviour whether we’re baking a single bun or bread to feed an entire household. Like any system, it has a boundary where other effects tend to take over. What would happen if we tried to bake an earth-sized bread? Besides the weight taking over and collapsing its airy interior, what other boundaries could you imagine? And instead of going big, what would happen if we went ultra-small?
The next part, “HMC”, is simply Hamiltonian Monte Carlo, so that is of course pretty self-explanatory… yeah no, it’s definitely not, but allow me to get back to that part later on, as it requires its own dedicated post. For now, we can accept the idea as a way to pick a random sample for a model using probabilities.
So if we were to compare this to bread… I’m joking, I’ll lay off the bread for a while. For the “multi-CPU” part, we can instead imagine a factory... that makes bread. Having more CPUs is like the factory having more workers: ideally the production, or in this case the computation, doubles when you double the number of workers, and overall we get to be more efficient. But like most processes, it doesn’t really double, as we encounter limitations. Some part of a process might not be able to run in parallel with the others, and we experience a slowdown. The more we can eliminate bottlenecks and streamline the process, the faster we can make the code run.
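To put rough numbers on the factory picture, here is a minimal sketch in Python. It uses Amdahl’s law, which is my own choice of model and not something taken from the project material, and the function name `amdahl_speedup` and the 90% figure are made up purely for illustration. The assumption is that a fixed share of the work can be split among workers while the rest has to be done by one worker alone.

```python
def amdahl_speedup(workers: int, parallel_fraction: float) -> float:
    """Ideal speedup when only `parallel_fraction` of the work can be shared.

    Assumption (Amdahl's law): the remaining serial part takes the same time
    no matter how many workers we add.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    # Total time relative to one worker: the serial part plus the shared part.
    relative_time = serial_fraction + parallel_fraction / workers
    return 1.0 / relative_time


if __name__ == "__main__":
    # Hypothetical example: 90% of the baking can be done in parallel.
    for workers in (1, 2, 4, 8, 16):
        print(workers, "workers ->", round(amdahl_speedup(workers, 0.9), 2), "x faster")
```

Under this assumption the speedup can never exceed one divided by the serial share, however many workers we hire, which is why removing bottlenecks matters as much as adding CPUs.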
For the keen reader, you probably remember there’s another part of the title, and that boils down to the architecture part, where we have two different layouts for our factory. The CPU has a specific internal layout/architecture that allows it to perform tasks. However, the General-Purpose Graphics Processing Unit (GPGPU) has another layout that in some cases performs better than CPUs. Implementing the GPGPU interface requires some additional steps to make it work, but it essentially performs the same task. This sort of brings us full circle, as the scaling part can be that we are limited by what a single GPU can store and compute, and the question becomes how we can scale beyond it. What we have to do is divide the problem up into bits, where each of the bits gets its own GPU.
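As a toy illustration of dividing the problem up into bits, the sketch below splits a list of work items into one contiguous chunk per device. This is plain Python with no real GPU involved, and `split_evenly` is a hypothetical helper of my own, not part of the project code. A real multi-GPU program would most likely also need the devices to exchange data with each other, which this sketch ignores.

```python
from typing import List


def split_evenly(n_items: int, n_devices: int) -> List[range]:
    """Split indices 0..n_items-1 into one contiguous chunk per device.

    Chunk sizes differ by at most one, so no device gets much more work
    than the others.
    """
    if n_devices < 1:
        raise ValueError("n_devices must be at least 1")
    if n_items < 0:
        raise ValueError("n_items must not be negative")
    base, extra = divmod(n_items, n_devices)
    chunks = []
    start = 0
    for device in range(n_devices):
        # The first `extra` devices each take one leftover item.
        size = base + (1 if device < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    # Hypothetical example: 10 pieces of work shared among 3 devices.
    for device, chunk in enumerate(split_evenly(10, 3)):
        print("device", device, "handles items", list(chunk))
```

Keeping the chunks contiguous and nearly equal in size is a simple design choice: every device finishes at about the same time, so none of them sits idle waiting for the slowest one.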
This week’s conclusion
Explaining concepts can be hard, and I hope this approach of setting up these spur-of-the-moment analogies helps you grasp the context of the project. Moving on, it will most likely get more technical, and next week’s blog will be about Hamiltonian Monte Carlo specifically. I also encourage you to look at some of the other projects, and I’ll try to highlight a few other projects and blogs for you every week.
You could, for example, check out Alishan and Lazaros. They’re doing a project on “Cross-Lingual Transfer Learning For Biomedical Texts” that involves using machine learning to read medical texts.