Project reference: 1821
Although building a scalable, efficient quantum simulator has been actively pursued by multiple groups in recent years, the complexity of the required solution (for example, multiple types of gates and formalisms) means the problem is still very much open and of interest to at least the HPC, Physics, Chemistry and Machine Learning communities. At the moment, multiple solutions are available, varying in quality and completeness. One of the more interesting debates concerns the suitability of GPUs for offloading the dense kernels, given the overarching issue of steep memory requirements: memory usage grows exponentially with the number of qubits.
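To make the memory issue concrete, here is a minimal sketch of the arithmetic. It assumes a full state-vector representation with one double-precision complex amplitude (16 bytes) per basis state; that representation and the helper name `state_vector_bytes` are illustrative assumptions, not taken from any particular simulator.

```cpp
#include <cassert>
#include <complex>
#include <cstdint>

// Bytes needed to hold a full state vector of n_qubits qubits,
// assuming one std::complex<double> per amplitude (an assumption of this sketch).
std::uint64_t state_vector_bytes(unsigned n_qubits) {
    assert(n_qubits < 60);  // keep 2^n * 16 within 64 bits
    const std::uint64_t amplitudes = std::uint64_t{1} << n_qubits;
    return amplitudes * sizeof(std::complex<double>);
}
```

Under these assumptions, 30 qubits need 16 GiB and 40 qubits need 16 TiB, so every additional qubit doubles the footprint. This is why the memory of a single GPU is exhausted quickly.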
In this project, we plan to investigate the viability of offloading parts of the computation on two different compute clusters: one supporting RDMA over a high-throughput, low-latency interconnect, and the other without RDMA capability and served by a high-throughput, high-latency interconnect.
The project can benefit from multiple areas of expertise and is adjustable to a range of possible final solutions. These range from simple shared-memory compiler offloading of existing simulator code, to more efficient CUDA-based implementations, to distributed RDMA-aware variants.
The middle ground is thus a focus on existing distributed simulators that do not yet benefit from accelerators.
Project Mentor: Damian Podareanu
Site Co-ordinator: Zheng Meyer-Zhao
The student will learn more about various HPC topics like accelerator offloading, distributed programming in a heterogeneous compute cluster, and performance monitoring. Another learning outcome is related to quantum computing and the challenges posed by simulating quantum processes.
Student Prerequisites (compulsory):
- Basic knowledge about accelerators and accelerator programming (at least basic CUDA)
- Basic physics knowledge (enough to grasp the minimal quantum computing background required)
- Knowledge of C++/MPI
Student Prerequisites (desirable):
Some experience in developing mixed code, such as MPI/OpenMP, will be an advantage.
- QX simulator: http://quantum-studio.net/
- An introduction to quantum computing (2007): https://arxiv.org/pdf/0708.0261.pdf
- An Introduction to Quantum Computing, Without the Physics (2017): https://arxiv.org/pdf/1708.03684.pdf
- qHiPSTER: The Quantum High Performance Software Testing Environment: https://arxiv.org/abs/1601.07195
- Week 1: Training week
- Week 2: Discussion about the project / fine tuning the goals and outcomes in accordance with the student / plan writing
- Week 3 – 7: Project Development (Accelerator versions and benchmarks)
- Week 8: Final report write-up
Adapting the Project: Increasing the Difficulty
A natural, and much more difficult, extension is to make the implementation use both MPI and GPU offloading.
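One reason the combined MPI and GPU variant is harder is that gates on high-order qubits pair amplitudes that live on different ranks. The sketch below illustrates the index arithmetic under a hypothetical layout: the state vector is split into equal contiguous chunks over a power-of-two number of ranks. The `Layout` struct and both helper functions are illustrative names, not part of any existing simulator.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: 2^n_qubits amplitudes split into equal contiguous
// chunks over 2^rank_bits ranks, so each rank holds 2^(n_qubits - rank_bits).
struct Layout {
    unsigned n_qubits;
    unsigned rank_bits;
    unsigned local_bits() const { return n_qubits - rank_bits; }
};

// True when a single-qubit gate on `target` only pairs amplitudes
// held by the same rank, so no communication is needed.
bool is_local(const Layout& l, unsigned target) {
    assert(target < l.n_qubits);
    return target < l.local_bits();
}

// Rank holding the partner amplitudes for a gate on a non-local target qubit.
std::uint32_t partner_rank(const Layout& l, std::uint32_t rank, unsigned target) {
    assert(!is_local(l, target));
    return rank ^ (std::uint32_t{1} << (target - l.local_bits()));
}
```

For example, with 10 qubits on 4 ranks (`Layout{10, 2}`), gates on qubits 0 to 7 are purely local, while a gate on qubit 8 makes rank 0 exchange data with rank 1. In a full implementation that exchange would be an MPI transfer, ideally directly between GPU buffers where the interconnect supports it.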
Adapting the Project: Decreasing the Difficulty
The simplest version of this project would be to make smart use of modern compiler offloading capabilities (in OpenMP 4.0+ or PGI + OpenACC) to experiment with existing simulators in a GPU-enabled, shared-memory environment.
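As a rough illustration of the compiler-offloading route, the sketch below applies a Hadamard gate to one qubit of a state vector using an OpenMP `target` region. The function name `hadamard` and the split real/imaginary storage are assumptions made here to keep the data mapping simple; they are not taken from any of the simulators in this project.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Apply a Hadamard gate to qubit `target` of a state vector stored as
// separate real and imaginary arrays (layout assumed for this sketch).
void hadamard(std::vector<double>& re, std::vector<double>& im, unsigned target) {
    assert(re.size() == im.size());
    const std::size_t n = re.size();  // expected to be a power of two
    const std::size_t stride = std::size_t{1} << target;
    assert(stride < n);
    const double s = 1.0 / std::sqrt(2.0);
    double* pr = re.data();
    double* pi = im.data();

    // Built without offload support, the loop simply runs on the host.
    #pragma omp target teams distribute parallel for map(tofrom: pr[0:n], pi[0:n])
    for (std::size_t k = 0; k < n / 2; ++k) {
        // Insert a 0 bit at position `target` to get the lower index of the pair.
        const std::size_t lo = ((k >> target) << (target + 1)) | (k & (stride - 1));
        const std::size_t hi = lo | stride;
        const double r0 = pr[lo], i0 = pi[lo], r1 = pr[hi], i1 = pi[hi];
        pr[lo] = s * (r0 + r1);  pi[lo] = s * (i0 + i1);
        pr[hi] = s * (r0 - r1);  pi[hi] = s * (i0 - i1);
    }
}
```

Each loop iteration touches a disjoint pair of amplitudes, so the iterations are independent and can be distributed freely. Mapping the arrays `tofrom` on every call copies the whole state to and from the device for a single gate, which a real port would avoid by keeping the state resident on the GPU across gates.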
We will provide access to the two computer clusters described:
- Cartesius (true supercomputer – InfiniBand connected)
- LISA (cluster computer – 40G Ethernet connected)
We will provide access to the source code (baseline without accelerator capabilities).
In addition, the student will need his/her own laptop.