Large scale accelerator enabled quantum simulator

Project reference: 1821

Building a scalable, efficient quantum simulator has been actively pursued by several groups in recent years. Because of the complexity of the solution required (for example, multiple types of gates and formalisms), the problem remains open and is of interest to at least the HPC, physics, chemistry and machine learning communities. Several solutions are currently available, varying in quality and completeness. One of the more interesting debates concerns the suitability of GPUs for offloading the dense kernels, given the overarching issue of steep memory requirements: memory usage grows exponentially with the number of qubits.
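To make the memory issue concrete, here is a minimal sketch of the arithmetic. It assumes a full state-vector formalism with one double-precision complex amplitude per basis state; the project text does not fix the formalism, and the function name `state_vector_bytes` is purely illustrative.

```cpp
#include <cassert>
#include <complex>
#include <cstdint>

// Bytes needed to hold the full state vector of `num_qubits` qubits.
// Assumption: one std::complex<double> amplitude (typically 16 bytes)
// is stored for each of the 2^n basis states.
std::uint64_t state_vector_bytes(unsigned num_qubits) {
    assert(num_qubits < 60);  // keep 2^n * sizeof(amplitude) within 64 bits
    const std::uint64_t amplitudes = std::uint64_t{1} << num_qubits;
    return amplitudes * sizeof(std::complex<double>);
}
```

Under these assumptions every additional qubit doubles the footprint: 30 qubits need 16 GiB, which is already at or beyond the memory of many single GPUs, and this is why distributing the state across nodes becomes relevant.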

In this project, we plan to investigate the viability of offloading parts of the computation on two different compute clusters: one that supports RDMA over a high-throughput, low-latency interconnect, and one without RDMA capability, served by a high-throughput, high-latency interconnect.

The project can benefit from multiple areas of expertise and is adjustable to a range of possible final solutions. These range from simple shared-memory compiler offloading of existing simulator code, through more efficient CUDA-based implementations, to distributed RDMA-aware variants.

The middle ground is thus to focus on existing distributed simulators that do not yet benefit from accelerators.

Project Mentor: Damian Podareanu

Site Co-ordinator: Zheng Meyer-Zhao

Learning Outcomes:
The student will learn more about various HPC topics like accelerator offloading, distributed programming in a heterogeneous compute cluster, and performance monitoring. Another learning outcome is related to quantum computing and the challenges posed by simulating quantum processes.

Student Prerequisites (compulsory):

  • Basic knowledge about accelerators and accelerator programming (at least basic CUDA)
  • Basic physics knowledge (enough to grasp the minimal quantum computing background required)
  • Knowledge of C++/MPI

Student Prerequisites (desirable):
Some experience developing mixed code, such as MPI/OpenMP, will be an advantage.

Training Materials:


  • Week 1: Training week
  • Week 2: Discussion about the project / fine tuning the goals and outcomes in accordance with the student / plan writing
  • Week 3 – 7: Project Development (Accelerator versions and benchmarks)
  • Week 8: Final report write-up

Adapting the Project: Increasing the Difficulty:
A natural, and much more difficult, extension is a full implementation that combines MPI with GPU offloading.

Adapting the Project: Decreasing the Difficulty:
The simplest version of this project would make smart use of modern compiler offloading capabilities (OpenMP 4.0+ or PGI with OpenACC) to experiment with existing simulators in a GPU-enabled, shared-memory environment.
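As a rough illustration of what compiler offloading can look like, the sketch below applies a Hadamard gate to one qubit of a state vector using an OpenMP `target` region. It is not taken from the baseline simulator: the function name `apply_hadamard`, the split real/imaginary storage, and the choice of gate are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Apply a Hadamard gate to qubit `target` of a state vector whose
// amplitudes are stored as separate real and imaginary arrays
// (hypothetical layout, chosen to keep the device code simple).
void apply_hadamard(std::vector<double>& re, std::vector<double>& im,
                    unsigned target) {
    const std::size_t dim = re.size();
    assert(im.size() == dim);
    assert((dim & (dim - 1)) == 0);             // dim must be a power of two
    assert((std::size_t{1} << target) < dim);   // target qubit must exist
    const std::size_t stride = std::size_t{1} << target;
    const std::size_t pairs = dim / 2;
    const double s = 1.0 / std::sqrt(2.0);
    double* a_re = re.data();
    double* a_im = im.data();

    // Each iteration updates one independent pair of amplitudes, so the
    // loop can be distributed across device threads. Without OpenMP
    // enabled, the pragma is ignored and the loop runs serially on the host.
    #pragma omp target teams distribute parallel for \
        map(tofrom: a_re[0:dim], a_im[0:dim])
    for (std::size_t p = 0; p < pairs; ++p) {
        // Index of the pair member whose target bit is 0.
        const std::size_t lo = (p / stride) * 2 * stride + (p % stride);
        const std::size_t hi = lo + stride;     // same index, target bit 1
        const double r0 = a_re[lo], i0 = a_im[lo];
        const double r1 = a_re[hi], i1 = a_im[hi];
        a_re[lo] = s * (r0 + r1);
        a_im[lo] = s * (i0 + i1);
        a_re[hi] = s * (r0 - r1);
        a_im[hi] = s * (i0 - i1);
    }
}
```

Note that the `map(tofrom: ...)` clause copies the whole state to the device and back on every call. A realistic port would likely keep the state resident on the GPU across many gates (for instance with a `target data` region), since those transfers can easily dominate the runtime.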

We will provide access to the two computer clusters described:

  • Cartesius (true supercomputer – InfiniBand connected)
  • LISA (cluster computer – 40G Ethernet connected)

We will provide access to the source code (a baseline without accelerator capabilities).
In addition, the student will need their own laptop.

