Matrix exponentiation on GPU for the deMon2k code
Project reference: 2014
deMon2k (density of Montréal) is a software package for density functional theory (DFT) calculations. It uses the linear combination of Gaussian-type orbitals (LCGTO) approach for the self-consistent solution of the Kohn-Sham (KS) DFT equations. Real-time time-dependent DFT (RT-TDDFT) has been implemented by Aurélien de la Lande's group at LCP Orsay. It is based on Magnus propagation and involves the exponentiation of a general complex matrix with no special properties. Three different methods for evaluating this exponential have been implemented in the current source code: diagonalization, Taylor expansion, and the Baker-Campbell-Hausdorff scheme. Each of these methods relies on linear algebra operations that consume most of the CPU resources during RT-TDDFT calculations and can be extremely time consuming for large systems. To assess the Magnus propagation in a parallel programming context, a strategy based on the ScaLAPACK/MPI library has been implemented for the Taylor expansion. So far, this work has been done only for CPU architectures. The aim of this internship is to write a basic prototype that implements the matrix exponentiation on GPU architectures with one of the three methods cited above. The main strategy will be to use the MAGMA library (a LAPACK implementation for GPUs). During this internship, the students will work at the Maison de la Simulation, one of the most important HPC institutions in France. They will learn some basic CUDA programming and how to perform parallel linear algebra operations with the MAGMA library. The students will get access to the Jean Zay supercomputer, located at IDRIS, which has more than 1000 NVIDIA Tesla V100 GPUs. Jean Zay is one of the most powerful machines in the world, ranked 46th on the TOP500 list of November 2019.
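To make the Taylor-expansion scheme concrete, here is a minimal NumPy sketch of the standard truncated-series approach with scaling and squaring. This is purely illustrative and not taken from deMon2k: the function name, truncation order, and scaling heuristic are the author's assumptions, and a GPU prototype would replace the matrix products below with MAGMA GEMM calls.

```python
import numpy as np

def expm_taylor(a, order=20):
    """Approximate exp(A) for a square complex matrix A via a
    truncated Taylor series with scaling and squaring.

    Illustrative sketch only; order and scaling choices are naive.
    """
    a = np.asarray(a, dtype=complex)
    n = a.shape[0]
    # Scale A by 2^s so the series converges fast: exp(A) = exp(A/2^s)^(2^s).
    norm = np.linalg.norm(a, 1)
    s = max(0, int(np.ceil(np.log2(norm)))) if norm > 0 else 0
    a_scaled = a / (2 ** s)
    # Truncated series: sum_{k=0}^{order} (A/2^s)^k / k!
    result = np.eye(n, dtype=complex)
    term = np.eye(n, dtype=complex)
    for k in range(1, order + 1):
        term = term @ a_scaled / k
        result = result + term
    # Undo the scaling by repeated squaring.
    for _ in range(s):
        result = result @ result
    return result
```

Each iteration costs one dense matrix-matrix product, which is exactly the operation that maps well onto GPU hardware and motivates the MAGMA-based prototype.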
Project Mentor: Karim Hasnaoui
Project Co-mentor: Aurélien de la Lande
Site Co-ordinator: Karim Hasnaoui
Participants: Pablo Antonio Martínez Sánchez, Theresa Vock
The students will learn how to perform parallel linear algebra on GPUs with the MAGMA library. They will also learn some basics of CUDA programming, how to write interfaces between the Fortran and C languages, and how to become familiar with a supercomputer environment (Linux system, job scheduler, etc.).
Student Prerequisites (compulsory)
- Basic knowledge in linear algebra
- Basic knowledge in Fortran or C language
Student Prerequisites (desirable):
- Basic knowledge in Linux environment
- Knowledge of a basic editor (vi or Emacs)
Basic Linux (see Section 2):
Vim editor tutorial:
The MAGMA library:
For accommodation, the students can contact me directly and I'll help them find a place for their stay.
Timeline for the completion of the project:
- Weeks 1 & 2: Learn how to use the MAGMA library, basic Linux, and how to use the job scheduler on Jean Zay
- Week 3: Write the prototype of the matrix exponentiation with the MAGMA library and do a quick benchmark
- Weeks 4 & 5: Learn how to write interfaces between the Fortran and C languages
- Week 6: Rewrite the prototype in Fortran using these interfaces
- Week 7: Final prototype benchmarks
- Week 8: Final report
Final Product Description:
The aim of the project is to write a prototype that calculates the matrix exponential on GPU architectures and to benchmark it. If the results are conclusive, the prototype will be included in the deMon2k code.
Adapting the Project: Increasing the Difficulty:
Adapting the Project: Decreasing the Difficulty:
The students will get access to the Jean Zay machine, the new converged supercomputer platform acquired by the French Ministry of Higher Education, Research and Innovation through the intermediary of the French civil company GENCI. Jean Zay is an HPE SGI 8600 computer consisting of two partitions. The first partition is composed of scalar nodes and the second of "converged" nodes, or more precisely, "converged accelerated hybrid nodes". These hybrid nodes are equipped with both CPUs and GPUs, which permits usages associated with both HPC and AI. The second partition contains more than 1000 GPUs. In November 2019, Jean Zay was ranked 46th on the TOP500 list.
Maison de la Simulation (CEA/CNRS)