Project reference: 1604
This project will involve optimising lattice Quantum Chromodynamics codes, which currently run on PRACE Tier-0 and other European Peta-scale supercomputers. The specific optimisation targeted involves using mixed precision, i.e. combined half, single and double precision arithmetic in the iterative solvers currently employed. The student will implement the set of extra functions required in an existing code, and run the code, check for correctness, tune the parameters of the solver and analyse the improvement in the performance.
Lattice QCD is a method to study Quantum Chromodynamics, the theory which describes how quarks bind to form the protons, neutrons and all other hadrons which make up the visible mass observed in the universe. Within lattice QCD, one simulates the strong interaction, one of the four fundamental forces of nature, directly from the underlying theory.
Using lattice QCD, important fundamental questions can be addressed, such as how the quarks are distributed inside protons and neutrons and what fraction of their intrinsic spin, momentum and helicity is carried by the quarks and gluons. These hadron structure calculations require large computer allocations and run on the world’s largest supercomputers.
Much of the functionality needed for mixed precision solvers is already available in the community codes tmLQCD and QUDA, such as single precision versions of the data structures and core functions required. However, a number of improvements remain to be explored, and these will be the focus of this project. They include optimising the solver restarts through appropriate manipulation of the restart vector, and combining mixed precision with other optimisations, such as eigenvalue deflation. Furthermore, the solver parameters will be tuned to find the optimal set for a given problem.
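To make the idea of a restarted mixed precision solver concrete, the following is a minimal sketch of one common pattern (defect correction): an inner conjugate gradient (CG) solve runs in emulated single precision, and at each restart the true residual is recomputed in double precision and used as the new right-hand side. It is not taken from tmLQCD or QUDA. The helper names (`f32`, `cg`, `mixed_precision_solve`), the tolerances and the small tridiagonal test matrix, which stands in for the lattice operator, are all assumptions made for illustration.

```python
import math
import struct

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def dot(u, v, rnd=float):
    """Dot product with every intermediate result passed through rnd."""
    acc = 0.0
    for a, b in zip(u, v):
        acc = rnd(acc + rnd(a * b))
    return acc

def matvec(A, v, rnd=float):
    return [dot(row, v, rnd) for row in A]

def axpy(alpha, x, y, rnd=float):
    """Return y + alpha * x, rounded through rnd."""
    return [rnd(yi + rnd(alpha * xi)) for xi, yi in zip(x, y)]

def cg(A, b, tol, max_iter, rnd=float):
    """Plain CG from a zero initial guess; returns (x, iterations used)."""
    x = [0.0] * len(b)
    r = [rnd(bi) for bi in b]
    p = list(r)
    rr = dot(r, r, rnd)
    target = tol * tol * rr
    for k in range(max_iter):
        if rr <= target:
            return x, k
        Ap = matvec(A, p, rnd)
        alpha = rnd(rr / dot(p, Ap, rnd))
        x = axpy(alpha, p, x, rnd)
        r = axpy(-alpha, Ap, r, rnd)
        rr_new = dot(r, r, rnd)
        p = axpy(rnd(rr_new / rr), p, r, rnd)
        rr = rr_new
    return x, max_iter

def mixed_precision_solve(A, b, tol=1e-12, inner_tol=1e-3, max_restarts=50):
    """Outer loop in double precision, inner CG in emulated single precision."""
    A32 = [[f32(a) for a in row] for row in A]  # low-precision copy of the operator
    x = [0.0] * len(b)
    bnorm = math.sqrt(dot(b, b))
    inner_total = 0
    for restart in range(max_restarts):
        # True residual in double precision: this is the restart vector.
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        rnorm = math.sqrt(dot(r, r))
        if rnorm <= tol * bnorm:
            return x, restart, inner_total
        # Normalise so the low-precision solve works with O(1) numbers.
        rhs = [ri / rnorm for ri in r]
        e, its = cg(A32, rhs, inner_tol, 10 * len(b), f32)
        inner_total += its
        x = axpy(rnorm, e, x)  # accumulate the correction in double precision
    return x, max_restarts, inner_total

# Toy symmetric positive definite matrix (assumption, not a Dirac operator).
n = 20
A = [[2.25 if i == j else -1.0 if abs(i - j) == 1 else 0.0
      for j in range(n)] for i in range(n)]
b = [1.0] * n
x, restarts, inner_iters = mixed_precision_solve(A, b)
print("restarts:", restarts, "inner iterations:", inner_iters)
```

In this sketch the two tunable parameters are `inner_tol` and `tol`: a looser inner tolerance means cheaper low-precision solves but more restarts, which is the kind of trade-off a parameter scan would map out.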
Project Mentor: Dr. Giannis Koutsou
Site Co-ordinator: Stelios Erotokritou
Student: Ambra Abdullahi Hassan
The student involved in this project will:
- Learn to use and contribute to two widely used community codes of the field, including GPU codes.
- Apply and expand their knowledge of linear algebra, first to understand the solvers involved and subsequently to optimise the restarts.
- Become familiar with using Tier-0 European supercomputer resources.
Student Prerequisites (compulsory):
- Undergraduate degree in Physics with an above-average grade and good programming experience
- Excellent programming skills in C
Student Prerequisites (desirable):
- Experience in programming in MPI
- Experience in lattice QCD calculations
- Experience in CUDA programming
Weeks 1 – 2: Familiarisation with codes: tmLQCD and QUDA, first access and example runs on Piz Daint and Juqueen.
Week 3: Run existing single precision functions and deflation solvers. Analyse and plot performance gained compared to full double precision.
Weeks 4 – 6: Implement:
- Optimise updates of the restart vector
- Combine with exact deflation
Week 7: Measure the improvements from the above optimisations and plot them as a function of the solver parameters.
Week 8: Refine results and prepare final presentation, hand-over to local researchers.
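As an illustration of the "combine with exact deflation" step in the plan above, the sketch below builds an initial guess from the lowest eigenpairs of the operator, x0 = sum_i v_i (v_i . b) / lambda_i, and compares CG iteration counts with and without it. It is a toy under stated assumptions: a 1D lattice Laplacian with a point source replaces the lattice QCD operator, the eigenpairs are written down analytically rather than computed by an eigensolver, and all function names (`apply_laplacian`, `low_modes`, `deflated_guess`, `cg`) are hypothetical.

```python
import math

def apply_laplacian(v):
    """1D lattice Laplacian: 2 on the diagonal, -1 next to it, Dirichlet ends."""
    n = len(v)
    out = []
    for j in range(n):
        left = v[j - 1] if j > 0 else 0.0
        right = v[j + 1] if j < n - 1 else 0.0
        out.append(2.0 * v[j] - left - right)
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def low_modes(n, k):
    """Analytic lowest-k eigenpairs of the operator above (toy stand-in for
    eigenvectors that a production code would obtain numerically)."""
    pairs = []
    norm = math.sqrt(2.0 / (n + 1))
    for m in range(1, k + 1):
        theta = m * math.pi / (n + 1)
        lam = 2.0 - 2.0 * math.cos(theta)
        vec = [norm * math.sin((j + 1) * theta) for j in range(n)]
        pairs.append((lam, vec))
    return pairs

def deflated_guess(b, pairs):
    """Exact solution restricted to the span of the given eigenvectors."""
    x0 = [0.0] * len(b)
    for lam, vec in pairs:
        coeff = dot(vec, b) / lam
        x0 = [xi + coeff * vi for xi, vi in zip(x0, vec)]
    return x0

def cg(b, x0, tol=1e-10, max_iter=1000):
    """CG in double precision; stops when |r| <= tol * |b|."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, apply_laplacian(x))]
    p = list(r)
    rr = dot(r, r)
    target = tol * tol * dot(b, b)
    for k in range(max_iter):
        if rr <= target:
            return x, k
        Ap = apply_laplacian(p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter

n = 100
b = [1.0] + [0.0] * (n - 1)  # point source on the first lattice site
x_plain, its_plain = cg(b, [0.0] * n)
x_defl, its_defl = cg(b, deflated_guess(b, low_modes(n, 20)))
print("CG iterations without deflation:", its_plain)
print("CG iterations with 20 deflated modes:", its_defl)
```

Removing the low modes from the initial residual lowers the effective condition number seen by CG, which is why the deflated solve needs fewer iterations here. The number of deflated modes is another parameter that could be included in the scan.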
Final Product Description:
Graphics material will involve plots showing the improvements obtained by this work. A plot visualising a parameter scan of the solver parameters and the improvements gained can be used as a reference by other users of the software.
Adapting the Project: Increasing the Difficulty:
The difficulty can be increased by considering mixed precision arithmetic in the eigenvalue calculation, i.e. by interfacing to mixed precision ARPACK routines.
Alternatively, depending on the student’s interest, a small physics project may be initiated using the improvements implemented.
CaSToRC will provide the student with office space, and access to Piz Daint at CSCS, Juqueen at Juelich, and local hybrid CPU/GPU/Xeon Phi development clusters.