Scaling HMC on large multi-CPU and multi-GPU architectures

Project reference: 2125

A substantial part of the data analytics and statistical modelling community relies on the Stan language and software to implement models based on Bayesian approaches. The language allows users to specify stochastic models that are then translated into C++ code and, upon execution, return estimates of the model parameters (stochastic parameter fitting).
This project will focus on a more efficient implementation of the Hamiltonian Monte Carlo (HMC) method used in Stan for sampling from probability distributions, with performance, parallelism and portability as goals.
To this end the method will be implemented using MPI+OpenMP or, potentially, SYCL.
Our experience has shown that reducing the complexity of the code and introducing hybrid parallelism can achieve significant (approximately 2x or greater) runtime reductions, even for models of moderate complexity.
The successful applicant will work on a reimplementation of HMC with different sampling strategies in C++ and its integration into Stan. The project will involve recurring benchmarking and scalability analyses.
As a stretch goal, a hybrid Python/C++ or a purely Python implementation may be considered, allowing execution on the GPU.
The expected outcome is a highly parallel and easily understandable implementation of the method, to be submitted to the Stan software package.

Project Mentor: Anton Lebedev

Project Co-mentor: Vassil Alexandrov

Site Co-ordinator: Luke Mason

Learning Outcomes:

The student will acquire key skills such as:
– Familiarity with software development for academic software.
– Familiarity with concepts of Bayesian inference and its applications.
– Fundamentals of statistical physics.
The student will also learn to benchmark, profile and modify CPU and multi-GPU code written mainly in C++ and CUDA, and will acquire the skills to implement hybrid programming approaches efficiently using MPI/OpenMP.

Student Prerequisites (compulsory):
Necessary (applicants lacking these will not be considered):
– Working knowledge of C++

Student Prerequisites (desirable):
Highly desirable (any two; could be acquired before starting the project):
– Familiarity with fundamental concepts of stochastics (PDF, CDF, Bayes’ rule).
– Hamiltonian mechanics or statistical physics.
– MPI/OpenMP programming.

Training Materials:
These can be tailored to the student once they are selected.


Week 1: Training week
Week 2: Literature review and preliminary report (plan writing)
Weeks 3–7: Project development
Week 8: Final report write-up

Final Product Description:
The final product will be an efficient parallel HMC implementation together with the corresponding internal report, convertible into a conference or, ideally, journal paper.

Adapting the Project: Increasing the Difficulty:
The project is at the appropriate cognitive level, taking into account the timeframe and the need to deliver a final working product and a report.

Adapting the Project: Decreasing the Difficulty:
The topic will be researched and the final product designed in full, but some features may be left undeveloped to ensure a working product, with a limited feature set, at the end of the project.

The student will need access to multi-CPU and multi-GPU machines, plus standard computing resources (laptop, internet connection).


Hartree Centre – STFC
