Project reference: 2132

Molecular surfaces have tremendous predictive power for evaluating interactions with potential binding partners, e.g. small molecules, drugs, biochemical ligands, antibodies, signaling molecules and many more. Consequently, an efficient method to produce such molecular surfaces in a robust yet general way will find many applications in contemporary science, ranging from computational biology (PB, MM/PBSA, cryo-EM, structural biology) to physics (Maxwell’s equations, CFD, MD).

The current project aims to continue the efforts already initiated in last year's SoHPC project 2024, where a novel approach based on Marching Tetrahedra was developed from scratch. Particularly encouraging were the edge-based formulation, the principal availability of an efficient method to identify closed subsets of surfaces, the analytical classification using the Euler characteristic and first ports to the GPU. Given all these remarkable achievements (made by even more remarkable SoHPC students), it is only natural to pursue an extension of these activities in this year's edition of the SoHPC 2021 program. Ideally, one would take over as much as possible from the previous implementation. However, this is not a binding constraint, and depending on the outcome of initial validations every aspect of the current approach may become subject to change. Anticipated improvements are expected in the following areas: (i) general applicability to any molecular structure defined in PDB format, (ii) enhanced robustness, (iii) proven smoothness of the molecular surface devoid of any internal artefacts, and (iv) significantly scaled-up size coverage. The target platform will again be the GPU, as it holds the strongest promise of delivering the performance required to cope with large-scale biomolecular structures.
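
As a concrete illustration of the analytical classification via the Euler characteristic mentioned above, the sketch below (a minimal, hypothetical Python example, not taken from the existing code base) counts vertices, edges and faces of a closed triangle mesh and evaluates chi = V - E + F, which equals 2 for a closed genus-0 surface and 2 - 2g for a surface with g handles:

    # Minimal sketch: Euler characteristic of a closed triangle mesh (chi = V - E + F).
    # Hypothetical example data, not taken from the project's code base.

    def euler_characteristic(triangles):
        """triangles: list of (i, j, k) vertex-index triples describing a closed surface."""
        vertices = set()
        edges = set()
        for i, j, k in triangles:
            vertices.update((i, j, k))
            # store each undirected edge exactly once
            edges.update({tuple(sorted(e)) for e in ((i, j), (j, k), (k, i))})
        V, E, F = len(vertices), len(edges), len(triangles)
        return V - E + F

    # A tetrahedron is the smallest closed triangulated surface: chi should be 2.
    tetra = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
    print(euler_characteristic(tetra))   # -> 2, i.e. a sphere-like (genus-0) surface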

The proposed activity includes a great variety of individual tasks (which can be handed out independently in accordance with trainees' skill level and interest), starting from simple geometric considerations and encompassing all essential steps of the development cycle characteristic of contemporary scientific software development.

Molecular surfaces of a) the spike protein of SARS-CoV-2 and b) EAFP2, an antifungal peptide. Students from last year's SoHPC project 2024 have already implemented a fully functional molecular surface program, for which results are shown in panel b). However, this approach needs to be further validated, made more robust, re-examined, fine-tuned and scaled up considerably to also cover large biomolecules such as the one shown in panel a), which is approximately 45x the size of b).

Project Mentor: Siegfried Hoefinger

Project Co-mentor: Markus Hickel, Balazs Lengyel and David Fischak

Site Co-ordinator: Claudia Blaas-Schenner

Participants: Miriam Beddig, Ulaş Mezin

Learning Outcomes:
Familiarity with basic development strategies in HPC environments. A broader understanding of key algorithms in scientific computing and how to develop corresponding implementations in an efficient way.

Student Prerequisites (compulsory):
Just a positive attitude towards HPC for scientific applications and the readiness for critical and analytical thinking.

Student Prerequisites (desirable):
Familiarity with Linux, basic programming skills in C/C++/Fortran, experience with GPUs, basic understanding of formal methods and their translation into scientific applications;

Training Materials:
Public domain materials and some web information about CUDA.

Workplan:

  • Week 1: Basic HPC training; explore local HPC system;
  • Week 2: Theory and first runs with previous code;
  • Week 3: Workplan formulation;
  • Weeks 4-7: Actual validation, refinement, fine-tuning, performance assessment and upscaling;
  • Week 8: Write up a final report and submit it;

Final Product Description:
Ideally, we will obtain a robust and highly efficient molecular surface program implemented on the GPU that can process biomolecules like the SARS-CoV-2 spike protein in less than 0.02 s. Even more important, however, is that the summer students gain solid experience with practical work in HPC environments.

Adapting the Project: Increasing the Difficulty:
Increasing performance gains in absolute terms as well as relative to existing implementations;

Adapting the Project: Decreasing the Difficulty:
Various optional subtasks can either be dropped or carried out in greater detail.

Resources:
Basic access to the local HPC infrastructure (including various GPU architectures) will be granted. The codebase from last year's SoHPC project 2024 is readily available.

Organisation:

VSC Research Center, TU Wien

Project reference: 2130

In recent years there have been massive developments in high-performance computing architectures. More precisely, almost all supercomputers now include accelerators such as GPUs. This raises the question of whether the available HPC resources are being used effectively in scientific computing, and it is therefore necessary for the research community to exploit such heterogeneous machines for scientific applications. When porting code to a new architecture, many aspects need to be considered, especially performance analysis and correctness.

https://sciencebusiness.net/news/eurohpc-partnership-opens-bidding-would-be-supercomputer-hosts

Project Mentor: Dr. Ezhilmathi Krishnasamy

Project Co-mentor: Dr. Sebastien Varrette

Site Co-ordinator: Dr. Ezhilmathi Krishnasamy

Participants: Theodoros Aslanidis, Martin Stodůlka

Learning Outcomes:
* Porting scientific algorithms to GPUs.
* A deeper understanding of GPU architecture and its limitations.
* Performance analysis and code optimization.
* Numerical analysis and its applications.

Student Prerequisites (compulsory):
* Programming skills in C/C++ and CUDA programming

Student Prerequisites (desirable):
* Basic knowledge in parallel programming model or parallel computer architecture and applied mathematics.

Training Materials:
CUDA: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html

Workplan:

Week 1: Training week
Week 2: Literature review and preliminary report (plan writing)
Weeks 3-7: Project development
Week 8: Final report write-up

Final Product Description:
* Port scientific C/C++ code to the GPU and check correctness.

Adapting the Project: Increasing the Difficulty:
* Using multiple GPUs might increase the difficulty of the project.

Adapting the Project: Decreasing the Difficulty:
* CUDA implementation of only a few tasks.

Resources:
* An open-source software framework will be considered.
* The student will get a desktop computer and an HPC account on the Iris supercomputer of the University of Luxembourg.

Organisation:

ULux-University of Luxembourg

Project reference: 2128

S-gears are a specific variant of cylindrical gears with a modified tooth profile geometry that can, in several aspects, provide superior performance to the more traditionally used involute or cycloid gears. This type of gear has in the past been employed in both metal and polymer gear applications.

A successful implementation of this type of gear in engineering applications, however, requires the gear geometry to be generated based on the underlying S-gear rack equation. To achieve this, a custom algorithm will have to be developed that enables the generation of both spur and helical 3D gear geometries in a generic CAD format. To achieve optimal gear drive performance, the S-gear geometry has to be tailored to the specific volumetric limitations and power transmission requirements of a given case study. Hence, an optimisation procedure, based on transient finite element (FE) mechanical analyses of the gear meshing process, will also have to be developed and implemented using open-source and/or commercial FE software on an available HPC cluster.
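
As a purely structural sketch of how a helical 3D geometry can be obtained from a 2D transverse profile, the code below stacks rotated copies of a profile curve along the gear axis, with the twist per unit length following from the helix angle and pitch radius. The 2D profile function used here is a placeholder; the actual flank must be derived from the S-gear rack equation.

    # Sketch: build a 3D helical point cloud by stacking rotated copies of a 2D
    # transverse profile. The 2D profile below is a PLACEHOLDER, not the S-gear flank.
    import numpy as np

    def placeholder_profile(n=200, r_root=18.0, r_tip=22.0):
        """Hypothetical 2D tooth-flank curve in the transverse plane (x, y)."""
        t = np.linspace(0.0, 1.0, n)
        r = r_root + (r_tip - r_root) * t
        phi = 0.15 * (1.0 - t) ** 2           # made-up flank shape
        return np.column_stack((r * np.cos(phi), r * np.sin(phi)))

    def helical_sweep(profile_xy, width=30.0, beta_deg=20.0, r_pitch=20.0, n_layers=60):
        """Stack rotated copies of the profile; the twist angle grows linearly with z."""
        beta = np.radians(beta_deg)
        zs = np.linspace(0.0, width, n_layers)
        layers = []
        for z in zs:
            psi = z * np.tan(beta) / r_pitch   # rotation of the transverse section at height z
            c, s = np.cos(psi), np.sin(psi)
            rot = np.array([[c, -s], [s, c]])
            xy = profile_xy @ rot.T
            layers.append(np.column_stack((xy, np.full(len(xy), z))))
        return np.vstack(layers)               # (n_layers * n_points, 3) point cloud

    points = helical_sweep(placeholder_profile())
    print(points.shape)   # a spur gear corresponds to beta_deg = 0 (no twist)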

Custom-developed Python code for simple 2D S-gear profile curve generation; ANSYS-based FEM analysis of a benchmark gear meshing case

Project Mentor: Dr. Borut Černe

Project Co-mentor: Dr. Damijan Zorko

Site Co-ordinator: Dr. Pavel Tomšič

Participants: Ceren Tamkoç, Bartu Yaman

Learning Outcomes:
The student will obtain in-depth knowledge in the fields of FE-based contact mechanics, CAD modelling, gear drive development, and the development of parallel processing algorithms.

Student Prerequisites (compulsory):
Basic to intermediate knowledge in the fields of CAD modelling and mechanical FEM simulations; basic to intermediate programming knowledge (Python, Fortran and/or C/C++)

Student Prerequisites (desirable):
Basic understanding of parallel processing (Open MP and/or MPI)

Training Materials:
Scientific/technical literature about S-gears, mechanical/contact FEM analyses, benchmark gear meshing FEM simulation cases in ANSYS

Workplan:

  • W1: Introduction and training week
  • W2: S-gear CAD geometry generator
  • W3-7: Transient FEM analyses and optimisation procedure
  • W8: Final report and video recording

Final Product Description:
The final results of this project are:

  • S-gear CAD geometry generator
  • FEM-based gear optimisation algorithm
  • Simplified graphical interface for case preparation and execution (secondary requirement)

Adapting the Project: Increasing the Difficulty:
Develop the entire procedure using exclusively open-source software; upgrade the model to include heat losses and temperature-rise analysis (especially necessary for polymer gear analysis).

Adapting the Project: Decreasing the Difficulty:
Develop the entire procedure using only commercial software.

Resources:
HPC cluster at University of Ljubljana, Faculty of Mechanical Engineering, FEM software (ANSYS, Elmer FEM), CAD software and other necessary programming resources.


Organisation:

UL-University of Ljubljana

Project reference: 2127

In the current times of ever-growing data, cross-site collaboration, and evolving data processing techniques, moving to the cloud to carry out scientific research is becoming commonplace. With that shift, however, keeping track of where your data lives turns out to be key, because exploiting data locality can yield enormous performance boosts. At the same time, tracking those data locations can be tricky, so assistance in this matter can somewhat free you from that task.

SURF offers Research Data Management (RDM) services using iRODS. iRODS provides advanced tooling for data management, scalable data storage, high-performance data transfers and a policy engine to enforce RDM policies. Moreover, within iRODS you can annotate data objects with metadata, which can be used for data discovery or maintaining provenance.
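
As a small illustration of the metadata annotation described above (a hedged sketch assuming the python-irodsclient package and a reachable iRODS zone; host, credentials, paths and attribute names below are placeholders):

    # Sketch: attach discovery/provenance metadata to an iRODS data object.
    # Assumes python-irodsclient; connection details and names are placeholders.
    from irods.session import iRODSSession

    with iRODSSession(host="irods.example.org", port=1247,
                      user="alice", password="secret", zone="exampleZone") as session:
        obj = session.data_objects.get("/exampleZone/home/alice/measurements.csv")
        # Attach attribute/value pairs that can later be used for data discovery.
        obj.metadata.add("experiment", "workspace-demo")
        obj.metadata.add("instrument", "sensor-42")
        # List what is attached to the object.
        for avu in obj.metadata.items():
            print(avu.name, avu.value, avu.units)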

We are also developing SURF Research Cloud, where making things easy for the scientist is key. In our vision, as a scientist, you select your application along with the data that you want to work on, and that will deploy a virtual lab in the cloud for you and your colleagues to work in. We call that a Workspace.

Research Cloud should benefit from iRODS functionality both by querying for data based on metadata when designing a workspace and by storing workspace metadata for provenance, enabling reproducibility of scientific results.

Making data properly available for processing in a Workspace is ongoing work. This project is about studying and implementing real-life cases in different services, and compiling advice along with best practices.

SURF Research Cloud: a workspace’s components, including datasets

Project Mentor: Ander Astudillo

Project Co-mentor: Arthur Newton

Site Co-ordinator: Carlos Teijeiro Barjas

Participants: Maria Li López Bautista, João Quintiliano Sério Guerreiro

Learning Outcomes:
Interns will learn how to handle storage for data processing in scientific research, how to apply it in cloud computing, and how to use Research Data Management principles in practice.

All work will be done in real research environments.

Student Prerequisites (compulsory):
Linux administration, Data processing, Programming in Python

Student Prerequisites (desirable):
Good analytical skills, good communication skills, basic understanding of measuring performance, a feeling for architecture design, experience with exercises of the form “compare and contrast”, basic understanding of input/output and using different storage solutions

Training Materials:

Workplan:

Week 1: learning the problem, our initial ideas to tackle the problem and the tools and methodologies
Week 2-3: work on a first approach, coarsely implementing a simple first workflow
Week 4-6: evolve the basic concept into a scalable set-up, implementing a couple of (possibly) more complex workflows
Week 7-8: write a guide with best practices for setting up and working in the developed environment. Possibly, and depending on the intern’s interests, provide a first integration with the rest of the platform.

Final Product Description:

The project delivers:

  • Collection of reference implementations of pipelines for real life cases, using different data services
  • Catalog of best practices for real cases

Depending on time and intern’s interests, we can also include:

  • Implementation of some level of integration within SURF Research Cloud

Adapting the Project: Increasing the Difficulty:
Difficulty can be increased by tackling more complex cases, looking at more services, and combining multiple data sources.

Adapting the Project: Decreasing the Difficulty:
Difficulty can be decreased by tackling simpler cases, looking at fewer services, and focusing on single data sources.

Resources:
We host our own environment for working on this assignment. The interns will get laptops from us with access to our network. They can then develop their ideas.

The technologies we are thinking about revolve around data transfer. They include (but are not limited to): WebDAV, Ceph, CVMFS, S3, Swift, iRODS, Yoda. These technologies are used by several of the services that we can test with, and are rather commonplace, with plenty of information available on the Internet and readily available to download and install on most common operating systems. The interns are free to choose and explore alternatives once they get deeper into the matter.

Extra information about OpenStack, Ansible and Terraform can be helpful to understand the context that we operate in.

SURF will arrange any internal rights to data and the platform.

Organisation:

SURF B.V.

Project reference: 2126

This project consists of benchmarking computational chemistry applications on the Dutch national supercomputer. The goal of this project is to get acquainted with different relevant software for scientific simulations and to analyse and describe the performance of the most relevant applications that are currently run by researchers in the field. The performance results will provide a comprehensive summary of the best compilation options that can be used to achieve maximum performance for different types of simulations on CPUs and GPUs.

Project Mentor: Ana Maria de Carvalho Vicente da Cunha

Project Co-mentor: Carlos Teijeiro Barjas

Site Co-ordinator: Carlos Teijeiro Barjas

Participants: Sahin Can Alpaslan, Milana Mirkovic

Learning Outcomes:

  1. Knowledge about different types of simulations and scientific software for computational chemistry
  2. Learn how to analyse and optimize the performance for CPU/GPU applications
  3. Understand how to properly compile applications to get maximum performance
  4. Explore the performance possibilities of GPU-Offloading

Student Prerequisites (compulsory):
Basic knowledge of Linux

Student Prerequisites (desirable):
Experience working with HPC machines. Good understanding of OpenMP and MPI. Interest in learning about scientific simulations, particularly for computational chemistry.

Training Materials:

For a better understanding of the use of GPUs, the student can look at these resources:

Workplan:

During the time of the internship the student will learn how to benchmark applications in a new supercomputer without the need for specific prior knowledge.

* Week 1: getting familiar with the simulation software and input preparation for different simulations
* Weeks 2 to 4: submission of the different calculations, collection of timings for both CPU and GPU, and analysis of the results as they become available
* Weeks 5 to 7: thorough evaluation and polishing of results, analysing the impact of different hardware and compilation options and testing for performance.
* Week 8: deliver a report about the performance of the studied applications and best practices for different simulations.

Final Product Description:
From this project we expect to produce a white paper with the benchmark results of the latest computational chemistry applications. We also expect that this work will be helpful for future use of such applications in our systems.

Adapting the Project: Increasing the Difficulty:
The simulation algorithms used will be changed during the internship, so that the student may analyse the impact of a given type of computation and data layout in different hardware (varying the processor architecture and memory hierarchy).

Adapting the Project: Decreasing the Difficulty:
Use of the same simulation algorithm throughout the analysis, having only minimal changes in the setup.

Resources:

  • Laptop provided by SURF
  • Supercomputer account provided by SURF
  • Access to benchmarking software available at SURF: Q-Chem, NWChem, CP2K, GROMACS, Amber, ADF (AMS)

Organisation:

SURF B.V.

Project reference: 2122

Over the past twenty years, fast spectral methods have been developed for simulating rarefied gases. Such physical phenomena can be modelled with Boltzmann-type equations that are written in 1D+nD+mD in time-phase space (n = 1, 2, 3 and m = 2, 3).

We are interested in the Boltzmann-Nordheim model, where the collision term is trilinear and includes quantum effects of the collisions when the temperature of the gas is low. In such a case, a Bose-Einstein condensate can occur, which leads to a degeneracy of the distribution function into a combination of a Dirac delta distribution and a singular Maxwellian distribution.
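
For orientation, the bosonic collision operator (Boltzmann-Nordheim, also known as Uehling-Uhlenbeck) has, schematically and in the 3D velocity setting, the form below; standard notation is used (f = f(v), f_* = f(v_*), primes for post-collisional velocities) and normalisation constants are omitted:

    Q(f)(v) = \int_{\mathbb{R}^3} \int_{\mathbb{S}^2} B(v - v_*, \sigma)
              \left[ f' f'_* (1 + f)(1 + f_*) - f f_* (1 + f')(1 + f'_*) \right]
              \, \mathrm{d}\sigma \, \mathrm{d}v_*

The trilinearity mentioned above comes from the cubic products (such as f f_* f') that appear once the brackets are expanded.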

The aim of this project is the numerical investigation of this phenomenon. Since the Boltzmann-Nordheim model is potentially written in 7D, the numerical methods for discretizing it must be chosen carefully. In 2012, the homogeneous 2D case was studied by Filbet et al. using fast spectral methods. They also presented some numerical investigations of the approximation of this kinetic model by its macroscopic limit. At present, we are about to finish the extension to the 3D space-homogeneous (n = 0) case. With this new step, we may obtain the first Bose-Einstein condensate simulations through kinetic modelling.

The aim of the internship(s) is to extend the current work to non-homogeneous cases by discretizing the transport term and extending the numerical methods and codes to 1D+nD+3D with n = 1, 2, 3 in the phase space. For this purpose, strong skills in numerical analysis, code parallelization and high-performance scientific computing are welcome to address the upcoming challenges.

Project Mentor: Alexandre Mouton 

Project Co-mentor: Thomas Rey

Site Co-ordinator: Benoît Fresse

Participants: David Knapp, Artem Mavilutov

Learning Outcomes:
At the end of the internship, the student will be able to manage parallelized codes with MPI or OpenMP (or even with hybrid techniques). In addition, the student will be trained in the usual numerical methods for discretizing Boltzmann-type kinetic models.

Student Prerequisites (compulsory):
Mathematics: good knowledge of PDEs, numerical methods for ODEs and PDEs, the discrete Fourier transform
Computer science: strong skills in the C and Python languages, at ease with Unix systems
Scientific computing: skills in parallel computing (at least with MPI and OpenMP), data visualization

Student Prerequisites (desirable):
Skills in CMake and in kinetic and/or fluid modelling of rarefied gases are welcome.

Training Materials:
Additional information about the internship offer can be found on the KINEBEC webpage:
https://sites.google.com/site/moutonalexandre/research/kinebec

Workplan:

Week 1: training sessions with all students of SoHPC.
Weeks 2, 3, 4: development of numerical methods, getting started with the code and the HPC environment
Weeks 5, 6, 7: implementation and numerical tests
Weeks 7, 8: numerical tests, report writing
Week 8: numerical tests, report submission, preparation for the defense

Final Product Description:
The main goal of the internship is to provide a simulation of a Bose-Einstein condensate with the non-homogeneous Boltzmann-Nordheim model, with at least 1D in physical space. Higher dimensionality in space is the final goal but requires preliminary work in numerical analysis before any implementation.

Adapting the Project: Increasing the Difficulty:
If the 1D+3D model is successfully discretized and investigated, we may consider the extension to the 2D+3D model.

Adapting the Project: Decreasing the Difficulty:
If some difficulties remain, the student can work on alternative parallelization methods for discretizing the collision operator. At present, MPI techniques are used, so developing alternative versions of these routines using OpenMP or CUDA is also of interest.

Resources:
Laboratoire Paul Painlevé owns 3 nodes dedicated to developers in scientific computing. The central computing service also provides larger HPC resources for any lab member working in Lille University research units. Access is free and will be activated at the beginning of the internship.

Organisation:
LPP-Laboratoire Paul Painlevé (Université de Lille & CNRS)

Project reference: 2123

Spacecraft missions and infrared measurements have shown that the surface of most asteroids is covered by a layer of unconsolidated granular material called the Regolith. A regolith is defined as a layer or mantle of loose, incoherent, rocky material, of whatever origin, that nearly everywhere forms the surface of the land and rests on coherent bedrock. Such regolith can be observed on the Moon, Venus, Mars and its satellites, but also on smaller satellites and asteroids. Regoliths are thus widespread, and it is therefore important to study this material. Furthermore, the study of regolith lies at the intersection of planetary science, space exploration and materials science.

This project aims at studying the thermal response of a Regolith. Understanding this thermal response is key, since most information about small objects of our solar system comes from remote measurements of this Regolith material. In this project we will study the thermal cycles induced by radiative heating on this geometrically complex Regolith material. In particular, the mentor of the project is part of the scientific team of OSIRIS-REx (a NASA space mission) that is currently orbiting the asteroid Bennu. Among other scientific tasks, this mission will send back to Earth a sample of the Regolith of Bennu, which was successfully collected last October. The numerical framework developed in this work will thus be applied to study the data obtained in the OSIRIS-REx mission.

From a computational perspective, the objective is to model complex radiative heat exchanges between arrangements of particles which are representative of a real Regolith. To this end, a finite element mesh of a particle arrangement can be used to compute the visibility between all the particles’ surfaces. Once the view factors between pairs of surface elements are computed, they can be used to assemble and solve the thermal problem. The main difference compared to a classic finite element solver is that the global stiffness matrix contains radiative exchange terms relating nodes which do not have any element in common. This is challenging for a distributed memory parallel implementation because the stencil of the discretization for surface nodes is quite large and the sparsity of the finite element matrix is reduced.

In this context, the main task that will be carried out in this internship project is to implement a parallel finite element solver to simulate radiative heat exchanges within a Regolith. Existing in-house finite element routines will be used to compute the terms of the nonlinear partial differential equations, which will be solved using the PETSc suite.
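
As a minimal sketch of the kind of nonlinear solve involved, the toy example below uses petsc4py (rather than the in-house C routines) to solve a pointwise radiative balance with PETSc's SNES and an analytic diagonal Jacobian; the model and constants are illustrative only, and this is not the project's finite element solver.

    # Toy sketch with petsc4py: solve k*(T - T_env) + eps*sigma*T^4 = q per node with SNES.
    # NOT the project's FE solver; the model and constants are illustrative only.
    import sys
    import numpy as np
    import petsc4py
    petsc4py.init(sys.argv)
    from petsc4py import PETSc

    n = 32
    k, eps, sigma, T_env = 5.0, 0.9, 5.67e-8, 300.0
    q = np.linspace(200.0, 800.0, n)              # made-up incoming heat flux per node

    def residual(snes, X, F):
        T = X.getArray()
        F.setArray(k * (T - T_env) + eps * sigma * T**4 - q)

    def jacobian(snes, X, J, P):
        T = X.getArray()
        for i in range(n):                        # pointwise model -> diagonal Jacobian
            P.setValue(i, i, k + 4.0 * eps * sigma * T[i]**3)
        P.assemble()

    x = PETSc.Vec().createSeq(n); x.set(T_env)    # initial guess
    r = x.duplicate()
    J = PETSc.Mat().createAIJ([n, n], nnz=1); J.setUp()

    snes = PETSc.SNES().create()
    snes.setFunction(residual, r)
    snes.setJacobian(jacobian, J)
    snes.setFromOptions()
    snes.solve(None, x)
    print("converged reason:", snes.getConvergedReason())

In the real finite element system the Jacobian is of course not diagonal; the view factors couple surface nodes that share no element, which is exactly the reduced-sparsity issue described above.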

We seek candidates with a strong background in computational mechanics using distributed-memory numerical approaches.

A representative volume element (RVE) of the Regolith is shown on the left. This RVE has the average thermal behaviour of a region on the surface of the asteroid Bennu (shown on the right), target of the OSIRIS-REx mission.

Project Mentor: Daniel Pino Muños

Project Co-mentor: Modesar Shakoor

Site Co-ordinator: Karim Hasnaoui

Participants: Cormac McKinstry, Venkata Mukund Kashyap Yedunuthala

Learning Outcomes:
The student will discover the huge amount of research that surrounds a space exploration mission such as OSIRIS-REx. From the technical point of view, the student will acquire strong experience in distributed-memory parallelism for solving nonlinear convection-diffusion equations with the finite element method.

Student Prerequisites (compulsory):

  • Scientific computing
  • C programming
  • Distributed memory programming using MPI

Student Prerequisites (desirable):

  • PETSc suite
  • Finite element method

Training Materials:
PETSc tutorials, https://www.mcs.anl.gov/petsc/documentation/tutorials/index.html

Workplan:

Week 2: bibliographic study of the physical problem that will be solved and existing scientific computing libraries and tools
Week 3: PETSc training using online tutorials and plan for the parallel implementation of the nonlinear radiative heat transfer equation using PETSc
Week 4-5: parallel implementation of the nonlinear finite element solver using PETSc
Week 6: optimization of the parallel implementation for large numbers of finite elements and CPUs (weak and strong scalability)
Week 7-8: testing, validation for small and large Regoliths and final report

Final Product Description:
A parallel finite element solver that allows the simulation of radiative heat exchanges on a geometry representative of a real Regolith.

Adapting the Project: Increasing the Difficulty:
The existing ray-tracing code for computing the view factors used in the nonlinear heat transfer equation, which has very poor scalability, could be re-designed with a more efficient parallel implementation.

Adapting the Project: Decreasing the Difficulty:
Some aspects relevant to weak and/or strong scalability could be disregarded; for instance, the assembly and solution of the equations should be done in parallel, but not necessarily the input of the finite element mesh and view factors, or the output of the temperature field.

Resources:
The students have access to the computational cluster available at CEMEF/Mines ParisTech (60.9 TFlops) in order to carry out the development. Additionally, access to the Jean Zay cluster at IDRIS will also be provided (50,000 CPU hours).

Organisation:

MdlS-Maison de la Simulation (CEA/CNRS)

Project reference: 2119

The paradigm of quantum computers is completely different from that of classical digital computers. Understanding this paradigm will allow the student to understand, for example, how it is possible that quantum computers are able to solve problems of exponential complexity in less than exponential time.

Currently, there are rapid developments in the field of quantum computers. The aim of this project is not to deal with all the news in this area, but to provide a theoretical basis that will allow the student to understand the potential of quantum computers. For example, the difference between natural quantum parallelism and parallel programming on HPC systems will be shown.

Thanks to this knowledge, the student will be able not only to understand existing basic quantum algorithms such as Grover's, Shor's, Deutsch-Jozsa, Gauss sum estimation, the element distinctness problem or the quantum algorithm for solving linear systems of equations, but also to optimize their implementations or create modifications of them.

Implementation of the 8-qubit Grover's algorithm. At the top of the image is the oracle function and at the bottom is the Grover diffusion operator. The answer that is sought is the code 00101010 (42). This quantum circuit represents only 1 iteration of the algorithm. To achieve the maximum probability (in this case 0.999947) for the correct answer, 12 of these iterations are needed.
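
To make the structure described in the figure concrete, here is a minimal, hedged Qiskit sketch of a single Grover iteration on just two qubits (an oracle marking |11>, followed by the diffusion operator). For n = 2, one iteration already amplifies the marked state to probability close to 1, whereas the 8-qubit search in the figure needs 12 iterations.

    # Minimal 2-qubit Grover iteration (oracle marks |11>), using Qiskit.
    from qiskit import QuantumCircuit
    from qiskit.quantum_info import Statevector

    qc = QuantumCircuit(2)
    qc.h([0, 1])            # uniform superposition over all basis states
    # Oracle: a controlled-Z flips the phase of |11>
    qc.cz(0, 1)
    # Grover diffusion operator (inversion about the mean)
    qc.h([0, 1])
    qc.x([0, 1])
    qc.cz(0, 1)
    qc.x([0, 1])
    qc.h([0, 1])

    print(Statevector.from_instruction(qc).probabilities_dict())
    # -> '11' with probability ~1.0 after a single iteration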

Project Mentor: Jiří Tomčala

Project Co-mentor: 

Site Co-ordinator: Karina Pešatová

Participants: Lucia Absalom Bautista, Spyridon-Andreas Siskos

Learning Outcomes:
Understanding the paradigm of quantum computers and their programming. Ability to design a quantum circuit, enabling the solution of problems with exponential complexity in a shorter time.

Student Prerequisites (compulsory):
Knowledge of Linux as the development environment.

Student Prerequisites (desirable):
Knowledge of linear algebra. Experience in Python programming.

Training Materials:
“Quantum Computation and Quantum Information” by Isaac Chuang and Michael Nielsen.
“Quantum Algorithm Implementations for Beginners” by various authors, which can be downloaded from here: https://arxiv.org/pdf/1804.03719.pdf
Etc.

Workplan:

1st week: quantum bits
2nd week: single qubit operations
3rd week: quantum gates
4th week: Grover’s algorithm
5th week: element distinctness problem quantum algorithm
6th week: estimating Gauss sums quantum algorithm
7th week: Shor’s algorithm
8th week: solving linear systems of equations quantum algorithm

Final Product Description:
Running the student’s own quantum programs (circuits) on a simulator and then on a real quantum computer.

Adapting the Project: Increasing the Difficulty:
Attempting to create interesting modifications of the discussed algorithms.

Adapting the Project: Decreasing the Difficulty:
Omitting algorithms that are too difficult to understand.

Resources:

Organisation:

IT4Innovations National Supercomputing Center at VSB – Technical University of Ostrava

Project reference: 2118

Quantum computing is one of today's scientific hot topics, with the potential to vastly improve our computational capacity in certain areas through its use of quantum bits, which offer superposition and, consequently, much faster computation. Thus, it seems very promising for many possible applications in other areas such as cryptography, artificial intelligence, or numerical methods.

Quantum chemistry is one of its most prominent areas of application, with some algorithms already usable for solving real-world problems instead of just overly simplified ones. These tend to utilize the latest, state-of-the-art, quantum-adapted mathematical approaches, like numerical derivative approximation, bringing a new optimization aspect into them: not only runtime but also the number of qubits required. Thus, quantum chemistry provides an excellent opportunity to start with quantum computing, connecting an already-investigated field with established results, while offering many straightforward directions for improvement and combining multiple skills, most notably mathematics, quantum mechanics, and programming.

In this project, we aim to implement the Variational Quantum Eigensolver (VQE) to obtain potential energy surfaces in an on-the-fly manner for molecular dynamics simulations of two- and three-atom systems. These results will be compared with classical methods like MCSCF and MRCI, using them as a benchmark to assess the VQE's approximation ability depending on the adopted approximations, chosen trial functions, and other algorithm modifications.

The implementation will be performed using IBM's Qiskit package for Python, which allows developers to access real quantum machines and use many pre-implemented functionalities, so that the algorithm remains the main focus instead of supplementary work. Considering that different types of problems can occur when computing on real machines, simulators will also be used to compare results with the real machine and to assess the influence of different levels of noise and ways of mitigating it.
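
As a minimal illustration of the variational principle behind VQE (a toy single-qubit Hamiltonian with a one-parameter ansatz, not the molecular Hamiltonians targeted in this project; it assumes a recent Qiskit with the qiskit.quantum_info module), the sketch below scans the ansatz parameter and compares the variational minimum with the exact ground-state energy:

    # Toy VQE-style scan: minimize <psi(theta)|H|psi(theta)> for H = Z + 0.5 X
    # with the one-parameter ansatz |psi(theta)> = Ry(theta)|0>. Illustrative only.
    import numpy as np
    from qiskit import QuantumCircuit
    from qiskit.circuit import Parameter
    from qiskit.quantum_info import SparsePauliOp, Statevector

    theta = Parameter("theta")
    ansatz = QuantumCircuit(1)
    ansatz.ry(theta, 0)

    H = SparsePauliOp.from_list([("Z", 1.0), ("X", 0.5)])

    def energy(t):
        psi = Statevector.from_instruction(ansatz.assign_parameters({theta: t}))
        return float(np.real(psi.expectation_value(H)))

    grid = np.linspace(0.0, 2.0 * np.pi, 401)
    print("variational minimum:", min(energy(t) for t in grid))
    print("exact ground energy:", min(np.linalg.eigvalsh(H.to_matrix())))

In a full VQE the grid scan is replaced by a classical optimizer, the expectation values come from (noisy) measurements, and the Hamiltonian is the qubit-mapped molecular Hamiltonian.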

Snapshot of dynamical simulation of N2 + He system

Potential energy curves of N2^+ system

Graphical visualisation of a quantum circuit

Project Mentor: Martin Beseda

Project Co-mentor: Stanislav Paláček

Site Co-ordinator: Karina Pešatová

Participants: Carola Ciarameletti, Jenay Patel

Learning Outcomes:
The student will learn the basics of programming in Python and obtain some knowledge of quantum chemistry. The main point will be quantum computing, where the student should gain an overview of the current state of the art in the field and hands-on experience with some of the methods.

Student Prerequisites (compulsory):
Students need to have a basic knowledge of programming and mathematics, most notably linear algebra.

Student Prerequisites (desirable):
It is desirable to have previous experience with programming in Python, as well as an understanding of the basics of quantum mechanics.

Training Materials:

https://qiskit.org/textbook/preface.html – Qiskit textbook, very convenient for self-learners
https://quantum-computing.ibm.com/ – IBM tools for learning of quantum computing, including a graphical editor

Workplan:

The 1st week is planned in a tutorial-like way, with frequent everyday talks with the students, explanations of basic concepts, and installation of the Anaconda package and Qiskit.
The 2nd and 3rd weeks are devoted to the implementation of basic functionalities, like gradient approximation and the subsequent optimization method, together with preparation of the first report.
The main part of the implementation is expected to be performed from the 4th to the 6th week, with a working simulation ready by then.
The 7th week will comprise minor improvements and comparisons of different techniques for approximation (gradients, etc.) and noise mitigation.
The final report and presentation will take place during the 8th week, together with explanations of more advanced concepts and possible directions for continued self-study in this field after the end of SoHPC21.
In the case of two students, the implementation part will be larger, as they will also be expected to work on parts of the code outside of Qiskit and to implement the code in multiple ways in order to compare them at the end.

Final Product Description:
The main result of the project will be a Python (Qiskit) implementation of the VQE method, joined with our code for molecular dynamics simulations.

Adapting the Project: Increasing the Difficulty:
The difficulty can easily be increased by adopting more advanced techniques for gradient approximation and qubit-count reduction, which require considerably more effort to comprehend.

Adapting the Project: Decreasing the Difficulty:
The main point of the project lies in the implementation of the VQE method itself, i.e. in the variational optimization of a specific functional and the subsequent determination of the eigenvalues of the Hamiltonian operator. That said, this part alone can be implemented without connection to the dynamical simulations, as the obtained energies themselves are a sufficient result to demonstrate the strengths of this approach.

Resources:
The only equipment needed will be personal laptops and access to IBM's quantum computers. We assume that students are already equipped with personal computers; IBM offers free access to some of its machines, and we will apply for that at the beginning of SoHPC.

Organisation:

IT4Innovations National Supercomputing Center at VSB – Technical University of Ostrava

Project reference: 2121

Today's supercomputing hardware provides a tremendous amount of floating-point operations (FLOPs). However, most of these FLOPs can only be harvested easily if the algorithm exhibits lots of parallelism. Additionally, efficient use of resources strongly depends on the underlying tasking framework and its mechanisms for distributing work to the cores.

In this project we turn our efforts towards our tasking framework “Eventify”. We will investigate the performance and bottlenecks of our taskified Fast Multipole Method (FMM) as well as some simpler microbenchmarks.

Depending on your scientific background we will pursue different goals.

First, Eventify can be utilized to implement a set of microbenchmarks to compare against other parallelization approaches and paradigms like OpenMP or HPX.

Second, Eventify can be benchmarked in a full FMM run and compared against other parallelization approaches and paradigms like OpenMP or HPX.

The challenge of both assignments is to execute/schedule tiny to medium-size compute kernels (tasks) without large overheads within the tasking frameworks. We will utilize algorithm knowledge in order to speed things up and circumvent synchronization bottlenecks.

What is the fast multipole method? The FMM is a Coulomb solver and allows the computation of long-range forces arising in molecular dynamics, plasma physics or astrophysics. A straightforward approach is limited to small particle numbers N due to its O(N^2) scaling. Fast summation methods such as PME, multigrid or the FMM are capable of reducing the algorithmic complexity to O(N log N) or even O(N). However, each fast summation method has auxiliary parameters, data structures and memory requirements which need to be provided. The layout and implementation of such algorithms on modern hardware strongly depends on the available features of the underlying architecture.
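
For reference, the direct O(N^2) pairwise summation that fast summation methods replace looks, in a stripped-down Python/numpy sketch with unit charges and a unit Coulomb constant, like this:

    # Direct O(N^2) Coulomb sum: the baseline that the FMM reduces to ~O(N).
    # Unit charges and a unit Coulomb constant are assumed for simplicity.
    import numpy as np

    def direct_coulomb_energy(positions, charges):
        """Total electrostatic energy by summing over all unique pairs."""
        energy = 0.0
        n = len(positions)
        for i in range(n):
            for j in range(i + 1, n):
                r = np.linalg.norm(positions[i] - positions[j])
                energy += charges[i] * charges[j] / r
        return energy

    rng = np.random.default_rng(0)
    pos = rng.random((500, 3))
    q = np.ones(500)
    print(direct_coulomb_energy(pos, q))   # cost grows quadratically with particle number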

Assumed workplace of a 2021 PRACE student at JSC

Project Mentor: Ivo Kabadshow

Project Co-mentor: Mateusz Zych

Site Co-ordinator: Ivo Kabadshow

Participants: Arthur Guillec, Tristan Michel

Learning Outcomes:
The student will familiarize himself with current state-of-the-art HPC hardware. He/she will learn how parallelization should be performed at a low level and use this knowledge to utilize/benchmark/extend our tasking framework in a modern C++ code base. He/she will use state-of-the-art benchmarking/profiling tools to test and improve performance for the tasking framework and its compute kernels, which are time-critical in the application.

Special emphasis will be placed on the different approaches each tasking framework provides. The student will learn how additional knowledge of the algorithmic workflow and dependencies can be used to improve parallel performance.

Student Prerequisites (compulsory):
Prerequisites

  • At least 5 years of programming experience in C++
  • Basic understanding of template metaprogramming
  • “Extra-mile” mentality

Student Prerequisites (desirable):

  • C++ template metaprogramming
  • Interest in C++11/14/17 features
  • Interest in low-level performance optimizations
  • Ideally student of computer science, mathematics, but not required
  • Basic knowledge on benchmarking, numerical methods
  • Mild coffee addiction
  • Basic knowledge of git, LaTeX, TikZ

Training Materials:
Just send an email … training material strongly depends on your personal level of knowledge. We can provide early access to the HPC cluster as well as technical reports from former students on the topic. If you feel unsure about the requirements, but do like the project, send an email to the mentor and ask for a small programming exercise.

Workplan:

Week – Work package

  1. Training and introduction to Eventify and HPC hardware
  2. Development of a small microbenchmark in OpenMP and HPX
  3. Development of a small microbenchmark in Eventify
  4. Comparison of the different tasking frameworks
  5. Extending parts of the FMM with HPX and/or OpenMP
  6. Benchmarking of the HPX/OpenMP FMM
  7. Optimization and benchmarking, documentation
  8. Generation of final performance results. Preparation of plots/figures. Submission of results.

Final Product Description:
The final result will be a good understanding of today's parallelization possibilities in HPC. The benchmarking results, especially the gain in performance, can easily be illustrated in appropriate figures, as is routinely done by PRACE and HPC vendors. Such plots could be used by PRACE.

Adapting the Project: Increasing the Difficulty:
The tasking framework can express different levels of parallelism. A particularly able student may also benchmark more parts of the algorithm or implement more complex algorithms.

Adapting the Project: Decreasing the Difficulty:
As explained above, a student who finds the task of adapting/optimizing the tasking framework too challenging could very well restrict himself to a simpler model or a partial set of the FMM.

Resources:
The student will get access (and computation time) on the required HPC resources for the project. A range of performance and benchmarking tools are available on site and can be used within the project. No further resources are required. Hint: We do have experts on all advanced topics, e.g. C++11/14/17, in house. Hence, the student will be supported when battling with ‘bleeding-edge’ technology.

Organisation:

JSC-Jülich Supercomputing Centre

Project reference: 2120

Simulations of classical or quantum field theories often rely on a lattice discretized version of the underlying theory. For example, simulations of Lattice Quantum Chromodynamics (QCD, the theory of quarks and gluons) are used to study properties of strongly interacting matter and can, e.g., be used to calculate properties of the quark-gluon plasma, a phase of matter that existed a few milliseconds after the Big Bang (at temperatures larger than a trillion degrees Celsius). Such simulations take up a large fraction of the available supercomputing resources worldwide.

Other theories have a lattice structure already “built in”, as is the case for graphene with its famous honeycomb structure. Simulations studying this material can build on the experience gathered in Lattice QCD. Such simulations require, e.g., the repeated solution of extremely sparse linear systems and update their degrees of freedom using symplectic integrators.
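
To give a flavour of the sparse-solver kernel mentioned above, here is a toy stand-in (a 2D Laplacian solved with conjugate gradients via SciPy, not an actual lattice Dirac or graphene hopping matrix):

    # Toy stand-in for the sparse linear solves that dominate such simulations:
    # a 2D Laplacian (5-point stencil) solved with conjugate gradients.
    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 128                                   # grid points per dimension
    I = sp.identity(n, format="csr")
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
    A = sp.kron(I, T) + sp.kron(T, I)         # extremely sparse system matrix

    b = np.ones(n * n)
    x, info = spla.cg(A, b)                   # Krylov solve, the hot kernel in practice
    print("CG converged" if info == 0 else f"CG info = {info}")

In production codes the corresponding solves are repeated many thousands of times, which is why tuning them for a specific architecture pays off.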

Depending on personal preference, the student can decide to work on graphene or on Lattice QCD. He/she will be involved in tuning and scaling the most critical parts of a specific method, or attempt to optimize for a specific architecture in the algorithm space.

In the former case, the student can select among different target architectures, ranging from Intel XeonPhi (KNL), Intel Xeon, AMD EPYC or GPUs (NVIDIA A100), which are available in different installations at the institute. To that end, he/she will benchmark the method and identify the relevant kernels. He/she will analyse the performance of the kernels, identify performance bottlenecks, and develop strategies to solve these – if possible taking similarities between the target architectures (such as SIMD vectors) into account. He/she will optimize the kernels and document the steps taken in the optimization as well as the performance results achieved.

In the latter case, the student will, after getting familiar with the architectures, explore different methods by either implementing them or using those that have already been implemented. He/she will explore how the algorithmic properties match the hardware capabilities. He/she will test the achieved total performance and study bottlenecks, e.g. using profiling tools. He/she will then test the method at different scales and document the findings.

In any case, the student is embedded in an extended infrastructure of hardware, computing, and benchmarking experts at the institute.

QCD & HPC

Project Mentor: Dr. Stefan Krieg

Project Co-mentor: Dr. Eric Gregory

Site Co-ordinator: Ivo Kabadshow

Participants: Thomas Marin, Marc Túnica Rosich

Learning Outcomes:
The student will familiarize himself with important new HPC architectures, such as Intel Xeon, NVIDIA or other accelerated architectures. He/she will learn how the hardware functions on a low level and use this knowledge to devise optimal software and algorithms. He/she will use state-of-the-art benchmarking tools to achieve optimal performance.

Student Prerequisites (compulsory):

  • Programming experience in C/C++

Student Prerequisites (desirable):

  • Knowledge of computer architectures
  • Basic knowledge on numerical methods
  • Basic knowledge on benchmarking

Training Materials:

Supercomputers @ JSC
http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/supercomputers_node.html

Architectures
https://developer.nvidia.com/cuda-zone
http://www.openacc.org/content/education

Paper on MG with introduction to LQCD from the mathematician’s point of view:
http://arxiv.org/abs/1303.1377

Introductory text for LQCD:
http://arxiv.org/abs/hep-lat/0012005
http://arxiv.org/abs/hep-ph/0205181

Introduction to simulations of graphene:
https://arxiv.org/abs/1403.3620
https://arxiv.org/abs/1511.04918

Workplan:

Week – Work package

  1. Training and introduction
  2. Introduction to architectures
  3. Introductory problems
  4. Introduction to methods
  5. Optimization and benchmarking, documentation
  6. Optimization and benchmarking, documentation
  7. Optimization and benchmarking, documentation
  8. Generation of final performance results. Preparation of plots/figures. Submission of results.

Final Product Description:
The end product will be a student educated in the basics of HPC, and optimized methods/algorithms or HPC software.

Adapting the Project: Increasing the Difficulty:
The student can choose to work on a more complicated algorithm or aim to optimize a kernel using more low level (“down to the metal”) techniques.

Adapting the Project: Decreasing the Difficulty:
A student who finds the task of optimizing a complex kernel too challenging could restrict himself to simple or toy kernels, in order to still have a learning experience. Alternatively, if the student finds a particular method too complex for the time available, a less involved algorithm can be selected.

Resources:
The student will have his own desk in an open-plan office (12 desks in total) or in a separate office (2-3 desks in total), will get access (and computation time) on the required HPC hardware for the project and have his own workplace with fully equipped workstation for the time of the program. A range of performance and benchmarking tools are available on site and can be used within the project. No further resources are required.

Organisation:

JSC-Jülich Supercomputing Centre

Project reference: 2115

Python is widely used in scientific research for tasks such as data processing, analysis and visualisation. However, it is not yet widely used for large-scale modelling and simulation on high performance computers due to its poor performance – Python is primarily designed for ease of use and flexibility, not for speed. However, there are many techniques that can be used to dramatically increase the speed of Python programs such as parallelisation using MPI, high-performance scientific libraries and fast array processing using numpy. Although there have been many studies of Python performance on Intel processors, there have been few investigations on other architectures such as AMD EPYC and GPUs. In 2021, EPCC will have access to these architectures via the new UK HPC National Tier-1 Supercomputer ARCHER2 and the Tier-2 system Cirrus.
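
As a tiny, hedged illustration of the speed-up available from numpy array processing, the sketch below writes a Jacobi-style stencil update (the kind of operation such cavity-flow CFD codes rely on) first with nested Python loops and then in vectorised form; the array size and timings are illustrative only.

    # Illustration of numpy vectorisation vs. nested Python loops for a
    # Jacobi-style stencil update (illustrative of the CFD code's core operation).
    import time
    import numpy as np

    n = 500
    psi = np.random.random((n, n))

    def jacobi_loops(psi):
        new = psi.copy()
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                new[i, j] = 0.25 * (psi[i-1, j] + psi[i+1, j] + psi[i, j-1] + psi[i, j+1])
        return new

    def jacobi_numpy(psi):
        new = psi.copy()
        new[1:-1, 1:-1] = 0.25 * (psi[:-2, 1:-1] + psi[2:, 1:-1]
                                  + psi[1:-1, :-2] + psi[1:-1, 2:])
        return new

    for fn in (jacobi_loops, jacobi_numpy):
        t0 = time.perf_counter()
        fn(psi)
        print(fn.__name__, f"{time.perf_counter() - t0:.4f} s")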

A Summer of HPC project in 2020 developed an optimised parallel Python version of an existing C program which performs a Computational Fluid Dynamics (CFD) simulation of fluid flow in a cavity, including an initial GPU implementation. Studies were done on the previous UK national HPC system, ARCHER, and on Cirrus.

The UK national HPC service ARCHER2, a Cray Shasta system with 750,000 CPU cores, has recently been launched, and we are very interested in its performance characteristics. This year's project will involve investigating performance on ARCHER2, which has very different processing nodes (128-core AMD) from the Intel systems studied previously. There is also the option of continuing the GPU investigation on Cirrus.

Sample output of existing program showing turbulent flow in a cavity

Project Mentor: Dr. David Henty

Project Co-mentor: Dr. Mario Antonioletti

Site Co-ordinator: Catherine Inglis

Participants: Alejandro Dinkelberg, Jiahua Zhao

Learning Outcomes:
The students will develop their knowledge of Python programming and learn how to compile and run programs on a range of leading HPC systems. They will also learn how to use GPUs for real scientific calculations.

Student Prerequisites (compulsory):
Ability to program in one of these languages: Python, C, C++ or Fortran. A willingness to learn new languages.

Student Prerequisites (desirable):
Ability to program in Python.

Training Materials:
Material from EPCC’s Python for HPC course or the PRACE Python MOOC.

Workplan:

Task 1: (1 week) – SoHPC training week

Task 2: (2 weeks) – Understand the functionality of the existing parallel C and Python codes and make an initial port to the new HPC platform.

Task 3: (3 weeks) – Measure baseline performance on the new HPC platforms.

Task 4: (2 weeks) – Investigate performance optimisations and write the final report.

Final Product Description:
Benchmarking results for Python performance on a range of parallel machines;

Recommendations for how to improve Python performance on AMD EPYC processors.

Optimisation of a GPU-enabled parallel Python application.

Adapting the Project: Increasing the Difficulty:
The project can be made harder by investigating advanced optimisation techniques such as cross-calling from Python to other compiled languages such as C, C++ or Fortran.

Adapting the Project: Decreasing the Difficulty:
The project can be made simpler by considering only one of the target platforms, or by considering CPU-only versions and omitting the GPU work.

Resources:

Access to all HPC systems can be given free of charge by EPCC

Organisation:

EPCC

EPCC

Project reference: 2114

Recent technological advances allow us to assess millions of genetic data points (>30 million) to identify their involvement in disease. This is a two-orders-of-magnitude increase compared to the data available a few years ago. In biology, pathway analysis aims to find genes that work together, to improve our understanding of the genetic basis of diseases. This project will focus on re-engineering genomicper (https://CRAN.R-project.org/package=genomicper), a pathway analysis package, to allow efficient analyses at the scale we need today and to future-proof it as genomic data grows further. Genomicper is an R package, currently available from the CRAN repository.

This project aims to re-engineer genomicper so it can analyse both existing and much bigger data sets quickly. The application of the algorithm to large-scale data sets is currently hindered by:

  1. the fact that the software is currently based on a collection of functions written in R and
  2. it performs thousands of independent permutations on the dataset in a sequential manner.

Thus, the objectives of this project are:

  1. Understand the circular permutation algorithm (https://pubmed.ncbi.nlm.nih.gov/22973544/) that underlies genomicper analysis
  2. Profile and baseline the code performance to identify bottlenecks and opportunities for software improvement in performance and functionality
  3. Re-write the base algorithm in C/C++ and embed the code in R
  4. Benchmark and test the performance of any new algorithm with varying input sizes (e.g. 10/20/30 million data points)

Any resulting code improvements should be contributed back to the existing CRAN R package.
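
To illustrate the idea of the circular permutation step (a language-agnostic sketch in Python/numpy; function and variable names are hypothetical, not the genomicper API): SNP-level statistics kept in genome order are rotated by a random offset and the per-pathway score is recomputed each time, building a null distribution that preserves the local correlation structure.

    # Conceptual sketch of circular permutation (names are hypothetical, not genomicper's API):
    # rotate genome-ordered SNP statistics and recompute a pathway score to build a null.
    import numpy as np

    rng = np.random.default_rng(42)
    n_snps = 100_000
    snp_stats = rng.chisquare(df=1, size=n_snps)               # stand-in association statistics
    pathway_idx = rng.choice(n_snps, size=500, replace=False)  # SNPs mapped to one pathway

    def pathway_score(stats, idx):
        return stats[idx].sum()

    observed = pathway_score(snp_stats, pathway_idx)

    n_perm = 1000
    null = np.empty(n_perm)
    for p in range(n_perm):
        shift = rng.integers(1, n_snps)                        # random rotation of the whole genome
        null[p] = pathway_score(np.roll(snp_stats, shift), pathway_idx)

    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    print(f"empirical p-value: {p_value:.4f}")

The sequential permutation loop above is exactly the part the project aims to speed up, by re-writing it in C/C++ and running the independent permutations in parallel.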

Factors related to particular types of disease (from DOI: 10.1534/g3.112.002618).

Project Mentor: Dr. Mario Antonioletti

Project Co-mentor: Dr. Pau Navarro

Site Co-ordinator: Catherine Inglis

Participants: İrem Okur, Aybüke Özçelik

Learning Outcomes:

  1. Learn and implement the process of optimising a real-world piece of code.
  2. Learn about tool sets that can be used to achieve 1.

Student Prerequisites (compulsory):
The student should have skills in C or C++.

Student Prerequisites (desirable):
Knowledge of parallel techniques. Some R.

Training Materials:
Useful links:

https://rstudio.github.io/profvis/
https://cran.r-project.org/web/views/HighPerformanceComputing.html
https://bioconductor.org/packages/release/bioc/html/VariantAnnotation.html

Workplan:
The work will basically consist of slight variants of the following cycle: baseline the code performance for different data sizes and construct tests for correctness; profile the code; plan changes; apply optimisations; return to the baselines and tests.

Final Product Description:
An R package that can process existing problems faster and be able to tackle bigger problems than is currently possible.

Adapting the Project: Increasing the Difficulty:
A weaker student can be restricted to understanding the performance bottlenecks and suggesting improvements, while a stronger student can implement actual improvements such as converting R to C/C++.

Adapting the Project: Decreasing the Difficulty:
As above: a weaker student can baseline the code, understand what changes would need to be made and possibly implement some of the simpler cases. The same process could still be carried out, but the level of change implementation would be more restricted.

Resources:
All the codes and required tools are open source and/or freely available. For bigger problems, access to EPCC systems can be given; e.g. Cirrus should suffice. Suitable data sets for benchmarking/profiling will be provided.

Organisation:

EPCC

EPCC

Project reference: 2113

Exploitation of renewable energy sources is critical for humanity to timely address issues raised by climate change and for transitioning towards a sustainable zero-carbon economy. MPAS Atmosphere model, developed by the US National Center for Atmospheric Research, can be used to model the atmosphere given a set of initial and boundary conditions both globally and for a particular geographic region of interest. One important use case is estimating the available resources, e.g. to assess suitability of a specific region for wind/solar farm deployment.
MPAS is designed with parallelism in mind, and the aim of this project is to assess the scalability and performance of MPAS on ARCHER2, the new UK national supercomputer. To this end, a set of experiments will be devised by the participants and performed using increasingly finer input meshes for different geographical regions of interest and an increasing number of processes. Additionally, process placement is likely to influence performance due to the Non-Uniform Memory Access (NUMA) architecture of the system; this will also be investigated. Ultimately, the project aims to explore the scalability boundaries of MPAS, and the results will identify the main factors that limit potential scaling.
The results will be communicated via project report and a blog post.

Model for Prediction Across Scales (MPAS)

Project Mentor: Dr. Evgenij Belikov

Project Co-mentor: Dr. Mario Antonioletti

Site Co-ordinator: Catherine Inglis

Participants: Jonas Alexander Eschenfelder, Carla Nicolin Schoder

Learning Outcomes:
The student(s) will gain hands-on experience with running MPAS simulations, gain proficiency in using an HPC system to assess the scalability and performance of a production code, including the use of profiling tools to identify potential bottlenecks, and improve their communication skills by visualising the results, writing up a report and summarising the results in a blog post. This project is also a great opportunity to establish contacts with Earth System Science researchers and HPC experts at the University of Edinburgh, the Edinburgh Centre for Carbon Innovation, and EPCC.

Student Prerequisites (compulsory):

  • familiarity with one programming language
  • working knowledge in a Linux environment
  • commitment to complete the ARCHER2 Driving Test before the start of the project

Student Prerequisites (desirable):

  • proficiency in Fortran/C programming
  • proficiency in bash/Python scripting
  • knowledge of  parallel programming with MPI
  • knowledge of HPC systems and architectures
  • familiarity with profiling and debugging tools
  • domain knowledge in Earth System Science, Renewable Energy and/or Atmosphere Modelling
  • experience in visualisation

Training Materials:

Workplan:
Project timeline by week:

  1. Training week
  2. Planning and experimental design
  3. Test runs and project plan submission
  4. Initial runs (small and medium inputs)
  5. Further runs (large inputs)
  6. Further runs (contd) and post-processing
  7. Visualisation and report draft
  8. Report completion and submission

The workplan is adapted to multiple students during week two by splitting the experiments to be performed accordingly (see below). In particular, we may look into comparing two different HPC systems, looking at different compilers and their settings, or using meshes centred on different geographical regions.

Final Product Description:
The results will provide insight into MPAS scalability and performance on ARCHER2, which is interesting due to its deep NUMA hierarchy (when using over 10k cores). Depending on progress, the simulation outcomes may be useful for predicting the suitability of a chosen region for wind or solar farm deployment.

Adapting the Project: Increasing the Difficulty:
To increase the difficulty, more in-depth experiments can be performed, including varying compilers and compiler flags (e.g. for vectorisation), using additional meshes (centred around different regions), profiling using hardware performance counters, as well as investigating experimental GPU support.

Adapting the Project: Decreasing the Difficulty:
To decrease the difficulty, some of the existing model setups could be re-used. Additionally, the number of experiments could be limited, e.g. to a single compiler and HPC system, a single mesh, and/or only one geographical region.

Resources:

  • laptop and stable internet connection
  • HPC cluster access and budget

Organisation:

EPCC

Project reference: 2112

Datacentres in production are characterized by a combination of extreme parallelism and performance demands. This translates into a large number of resources to be managed, constantly stressed by intense computational patterns.

Each of the HW and SW resources reports metrics describing its status and usage. This amounts to terabytes of data produced daily, which can be leveraged to automate the management of the entire datacentre. Big-data, artificial intelligence and visualization technologies can be applied to deploy datacentre automation solutions.

With this in mind, CINECA and UNIBO have for years been pioneering the deployment of such solutions at the CINECA datacentre. The goal of this project is to extend the current solution with 3D visualization and predictive models in order to develop a full digital twin of the datacentre.

In the framework of the SoHPC program, we aim to integrate a 3D visualization of the datacentre with the live data collection, as well as with forecasting of future critical behaviours. Students will gain knowledge about data collection, ML analytics and data visualization.
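
As a minimal sketch of what forecasting "future critical behaviours" from collected metrics can look like, the snippet below flags anomalous node temperatures with a simple rolling-statistics rule; the telemetry is synthetic, and the real project would instead consume live data from the CINECA data collection framework and use more capable ML models (e.g. in TensorFlow).

  import numpy as np
  import pandas as pd

  # Synthetic node-temperature telemetry standing in for real datacentre metrics.
  rng = np.random.default_rng(0)
  temps = 55 + np.cumsum(rng.normal(0, 0.3, 500))
  temps[450:] += np.linspace(0, 12, 50)  # inject a drift towards overheating
  ts = pd.Series(temps, index=pd.date_range("2021-07-01", periods=500, freq="min"))

  # Flag samples that deviate strongly from the recent rolling statistics.
  roll_mean = ts.rolling("60min").mean()
  roll_std = ts.rolling("60min").std()
  anomalies = ts[(ts - roll_mean).abs() > 3 * roll_std]

  print(f"{len(anomalies)} anomalous samples flagged")
  print(anomalies.tail())

In the full digital twin, such flags (or the output of a trained predictive model) would be mapped onto the corresponding racks and nodes in the 3D visualization.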

Project Mentor: Andrea Bartolini

Project Co-mentor: Martin Molan

Site Co-ordinator: Massimiliano Guarrasi

Participants: David Mulero Pérez, Sepideh Shamsizadeh

Learning Outcomes:
Increase the students’ skills in:

  • Big Data Analysis
  • Data visualization
  • Python
  • Open Stack VM
  • Blender
  • HPC environments
  • Deep learning (depending on the difficulty)
  • TensorFlow (depending on the difficulty)
  • HPC infrastructures

Student Prerequisites (compulsory):
Python

Student Prerequisites (desirable):

  • At least a basic knowledge of at least one of the following tools: TensorFlow, Pytorch, Keras (TensorFlow/Keras preferred)
  • Pandas
  • Spark
  • Blender
  • 3D modelling

Training Materials:
None

Workplan:

Week 1: Common Training session
Week 2: Introduction to CINECA systems; short tutorials on big data, the data collection systems and the 3D visualization system; detailed work planning. Depending on the difficulty (and the interests of the students), an introduction to ML data analytics solutions.
Week 3: Problem analysis; deliver the final workplan at the end of the week.
Week 4, 5: Production phase:
Increased difficulty:

  • Benchmarking of existing ML data analytics solutions
  • Implementation of outputs from best performing ML solutions into 3D visualization software

Baseline project:

  • Importing data from data collection framework into 3D visualization solution

Week 6, 7: Final stage of the production phase. Implementing feedback from domain experts and end users (CINECA staff).
Week 8: Finishing the final movie. Writing the final report.

Final Product Description:
The creation of a digital twin of the datacentre.

Adapting the Project: Increasing the Difficulty:
Different ML solutions for operational data analytics (anomaly detection, downtime prediction) will be benchmarked and evaluated. The best performing solutions will be included in the final visualization.

Adapting the Project: Decreasing the Difficulty:
Basic parameters from the system will be integrated in the existing visualization software.

Resources:
The students will have access to our facility, our HPC systems and databases containing all the measurements, system logs and node status information. They can also manage a dedicated virtual machine.

Organisation:
CINECA – Consorzio Interuniversitario

Project reference: 2111

Automatic techniques for the recognition of submarine structures (e.g. canyons) are an absolute novelty in the oceanographic field. Apart from some preliminary experiments (Ismail et al. 2015, Huvenne 2002), the recognition of seabed structures such as canyons and mud volcanoes has so far been entrusted to the experience of scientists who manually highlight them on seabed images.

Considering the growth of available data, this approach can no longer be sustained. Thanks to the huge amount of already analysed data, a new approach based on AI techniques becomes preferable.

During this two-month internship the selected students will perform experiments to set up automated methods to recognize these kinds of structures.
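
As a minimal sketch of the intended direction (not the actual method, which the students will define themselves), the snippet below trains a tiny Keras convolutional classifier on random placeholder patches; in the real project the patches and labels would be derived from the OGS morphobathymetric data and the interpreters' manual annotations.

  import numpy as np
  import tensorflow as tf

  # Placeholder data: 64x64 single-channel "bathymetry" patches with binary
  # labels (1 = structure of interest, e.g. canyon; 0 = background).
  x = np.random.rand(200, 64, 64, 1).astype("float32")
  y = np.random.randint(0, 2, size=(200,))

  model = tf.keras.Sequential([
      tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 1)),
      tf.keras.layers.MaxPooling2D(),
      tf.keras.layers.Conv2D(32, 3, activation="relu"),
      tf.keras.layers.MaxPooling2D(),
      tf.keras.layers.Flatten(),
      tf.keras.layers.Dense(64, activation="relu"),
      tf.keras.layers.Dense(1, activation="sigmoid"),
  ])
  model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
  model.fit(x, y, epochs=2, batch_size=32, validation_split=0.2)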

This picture shows the Ionian Calabrian continental margin (Ceramicola et al. 2014). The colour layer (red to blue) is the morphobathymetric data acquired during the MAGIC project, funded by the Department of Civil Protection to allow the realisation of the first geohazard maps of the Italian continental margins. The legend indicates the different geohazard features mapped by an interpreter.

Project Mentor: Silvia Ceramicola

Project Co-mentor: Veerle Huvenne

Site Co-ordinator: Massimiliano Guarrasi

Participants: Mario Udo Gaimann, Raska Soemantoro

Learning Outcomes:
The selected students will learn to manage oceanographic data in order to find the most important seabed structures.
Increase the students’ skills in:
  • Python
  • AI libraries (e.g. PyTorch, Keras, TensorFlow, …)

Student Prerequisites (compulsory):
The student will need to have the following compulsory skills:

  • Python
  • Python main scientific libraries (at least basic knowledge):
    • Numpy
    • Pandas
    • Matplotlib
  • At least one of the following packages:
    • PyTorch
    • Tensorflow, Keras

Student Prerequisites (desirable):
Experience in image recognition methods

Training Materials:
None

Workplan:

  • Week 1: Training week with the other SoHPC students
  • Week 2: Training by CINECA (HPC facility, slurm batch scheduler, VNC, Blender, …) and OGS staff (How to read the data, how to visualize and explore them)
  • Week 3: Set up the workplan and start the work according to the students’ abilities and interests
  • Week 3 (end): Workplan ready.
  • Week 4-6: Continue the work on structures recognition
  • Week 7-8: Prepare the final video and report
  • Week 8 (end): The final video and report will be submitted

Final Product Description:
Our final result will be an automated tool to recognize seabed structures from oceanographic data. The selected students will also prepare a short video and a final report illustrating their work.

Adapting the Project: Increasing the Difficulty:
Depending on the students’ skills, they can try to recognize many different types of structures.

Adapting the Project: Decreasing the Difficulty:
In case the students are not able to complete the desired tool, they will work on the visualization of the oceanographic data using standard visualization tools.

Resources:
The oceanographic data and the standard software to inspect them will be provided by OGS. All the needed HPC software (mainly Python, PyTorch, Keras, TensorFlow, Pandas, NumPy) is released as open source and is already available on the CINECA clusters, which the students will use with their own provided accounts.

Organisation:
CINECA – Consorzio Interuniversitario

Project reference: 2110

The High Energy Physics (HEP) community has traditionally employed High Throughput Computing (HTC) facilities for LHC data processing and various types of physics analyses. Furthermore, with the recent convergence of AI and HPC, it becomes apparent that a single but modular and flexible type of facility could in the future replace single-purpose environments. The goal of this project is to take several types of workloads (compute-bound LHC event reconstruction, and much more I/O-driven physics analyses that could employ some type of ML or DL) and evaluate their effectiveness at HPC scale. LHC workloads are mostly data-driven, which means that a lot of data has to be ingested and a substantial amount of output produced. Doing this at the scale of a single node, or even tens of nodes, does not represent a challenge. However, when moving to thousands of nodes, the handling of input and output becomes a huge bottleneck. The HEP community as a whole is currently evaluating various proof-of-concept designs for how this data flow should function. The idea is to scale these two very different, yet both data-driven, workloads, understand the limitations of the existing model, and observe the peculiarities of HPC systems under heavy dataflow load.

Project Mentor: Maria Girone

Project Co-mentor: Viktor Khristenko (CERN) and Sadaf Roohi Alam (CSCS)

Site Co-ordinator: Maria Girone (CERN) and  Joost VandeVondele (CSCS)

Participants: Carlos Eduardo Cocha Toapaxi, Andraž Filipčič

Learning Outcomes:

  • Learn about data-driven sciences, which are increasingly prevalent due to the growing volumes of collected/recorded information.
  • Learn to scale out analytics applications using HPC facilities (the idea is very similar to what industry does with Hadoop, Apache Spark and other Big Data tools).
  • Learn about available system monitoring at HPC facilities.

Student Prerequisites (compulsory):

  • Working knowledge of at least one programming language
  • Bash, Linux
  • Familiarity with/some knowledge of networking (TCP/IP)

Student Prerequisites (desirable):

  • working knowledge of Linux
  • Working knowledge of TCP/IP stack
  • Slurm

Training Materials:
https://ieeexplore.ieee.org/document/5171374 Paper describing the existing job submission schema for Grid computing.

Workplan:

  • Week 1-2: Learn about the existing workloads: how to run them, what they do, etc. Learn about writing batch-system submission scripts. In short, get started by running things on a smaller scale.
  • Week 3-4: Use LHC event reconstruction (most likely CMS experiment event reconstruction) and scale it out to use more nodes. Observe the change in throughput as a function of node count. Learn how to monitor and understand the generated traffic. Test out various ways of doing I/O, e.g. reading directly from a file system, or integrating into the job submission the requirement that data be copied to the nodes. Identify the existing limitations for doing this on an HPC system.
  • Week 5-6: Use LHC physics analysis. Similar to Week 3-4, however here the requirements for I/O are at least 10x higher.
  • Week 7-8: Finalize.

Final Product Description:

  • Scale out two representative workflows that differ in their input/output and compute requirements, and observe how the figure of merit (throughput) changes when going to higher node counts.
  • Understand the impact of doing high-throughput data analysis at an HPC facility and its limitations: using a shared filesystem vs. pre-placing data on a node, and using the outgoing/incoming HPC connections (a sketch comparing the two I/O strategies follows below).
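
As a minimal sketch of how the two I/O strategies could be compared, the snippet below times a direct read from the shared filesystem against staging the file to node-local storage and reading it there. The file paths are placeholders, and repeated reads may be served from the page cache, so a careful measurement would use fresh files or drop caches between runs.

  import shutil
  import tempfile
  import time

  def time_read(path, block=16 * 1024 * 1024):
      """Read a file sequentially and return the elapsed time in seconds."""
      t0 = time.perf_counter()
      with open(path, "rb") as f:
          while f.read(block):
              pass
      return time.perf_counter() - t0

  # Placeholder locations: an input file on the shared filesystem and a
  # node-local scratch directory.
  shared_input = "/lustre/project/data/events.root"
  local_scratch = tempfile.mkdtemp(dir="/tmp")

  # Strategy 1: read directly from the shared filesystem.
  t_shared = time_read(shared_input)

  # Strategy 2: stage the file to node-local storage first, then read it there.
  t0 = time.perf_counter()
  local_copy = shutil.copy(shared_input, local_scratch)
  t_stage = time.perf_counter() - t0
  t_local = time_read(local_copy)

  print(f"direct read from shared FS: {t_shared:.2f} s")
  print(f"stage-in + local read     : {t_stage + t_local:.2f} s "
        f"(copy {t_stage:.2f} s, read {t_local:.2f} s)")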

Adapting the Project: Increasing the Difficulty:
Ideally we would not want an HPC site to host/store any of our data for a long period, as it essentially means that the HEP community loses full control of its data. On the other hand, reading/writing data over the HPC site’s external links could be costly. The idea is to move away from a 2-tier architecture to an N-tier one by employing smart dynamic data caching at the HPC site. If the workplan is too simple, we want to think about possible designs for such an architecture. In other words, after walking through steps 1)-4) of the workplan, we would like to understand patterns and possibilities for improving things.

Adapting the Project: Decreasing the Difficulty:
To decrease the difficulty of the project, we would just use LHC event reconstruction and scale a single application out.

Resources:

  • Access to an HPC site and support with co-mentoring a student and ensuring expertise from an HPC site.
  • Laptop

*Online only in any case

Organisation:
CERN

Project reference: 2109

The High Energy Physics (HEP) community has a large number of diverse workloads (e.g. pure statistical analysis vs. LHC event reconstruction) that differ not only in their requirements (compute- vs. I/O-driven) but also in their ability to utilize heterogeneous resources. To standardize these workflows and to simplify the evaluation of new computing platforms, a container-based benchmarking suite was developed, which has only recently started being tested at HPC facilities. The goal of this project is to contribute to this effort by adding new types of workflows to the already available set and, at the same time, testing HEP workloads on new types of hardware such as Nvidia GPUs, AMD CPUs and GPUs, ARM, etc. Furthermore, it would be interesting to understand how to combine efforts with the PRACE benchmarking tools (i.e. UEABS).
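
As a small, hedged illustration of how per-architecture results could be aggregated into comparable figures of merit, the snippet below reads hypothetical result files and normalises their scores; the file layout and JSON fields are placeholders, not the actual output format of the benchmarking suite.

  import json
  from pathlib import Path

  # Hypothetical result files, e.g. results/x86_64.json, results/arm64.json,
  # each containing a throughput score such as {"events_per_second": 123.4}.
  results = {}
  for path in Path("results").glob("*.json"):
      with open(path) as f:
          results[path.stem] = json.load(f)["events_per_second"]

  if results:
      baseline = results.get("x86_64", max(results.values()))
      print(f"{'architecture':<14} {'events/s':>10} {'relative':>9}")
      for arch, score in sorted(results.items(), key=lambda kv: -kv[1]):
          print(f"{arch:<14} {score:>10.1f} {score / baseline:>9.2f}")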

Project Mentor: Maria Girone

Project Co-mentor: David Southwick (CERN) and Sagar Dolas (SURF)

Site Co-ordinator: Maria Girone (CERN) and Carlos Teiieiro Barjas (SURF)

Participants: Miguel de Oliveira Guerreiro, María Menéndez Herrero

Learning Outcomes:

  • Students will get an opportunity to become familiar with containers and with how they are integrated into HPC facilities’ batch systems.
  • Experience using heterogeneous computing architectures and working knowledge of running scientific workloads on these platforms.

Student Prerequisites (compulsory):

  • Working knowledge of at least one programming language
  • Bash (listed explicitly here, in addition to the programming language above)
  • Some familiarity with distributed systems (i.e. anything beyond a single node)

Student Prerequisites (desirable):

  • Some familiarity/knowledge of Containers
  • Some familiarity with Slurm

Training Materials:
https://docs.docker.com/get-started/overview/
Docker’s HowTo is quite good; although Docker itself is not used much on HPC systems, it is still a good starting point for learning about containers: what they are and why they are needed.

Workplan:

  • Week 1-2: Start by analyzing the existing source base for the benchmarking suite and how images for containers are generated.
  • Week 3-4: Incorporate several CPU/GPU based workloads into the suite and generate images for them.
  • Week 5-6: Test out the images at an HPC facility. Important to test out images on different architectures.
  • Week 7-8: Push all the developed updates upstream to the main repo. Write report/give presentations about the experience.

Final Product Description:

  • Integrate several new types of workloads (e.g. GPU-based workloads)
  • Provide performance figures of merit for the newly integrated workloads, and for the existing ones that had not yet been tested on the new types of hardware.

Adapting the Project: Increasing the Difficulty:
Of course, just running the workload on a new type of hardware might sound too simple. To increase the difficulty, we would expect the students not just to run a given workload, but also to profile it using whatever tools the HPC site provides for this purpose (i.e. we would expect more than just “perf” to be available).

Adapting the Project: Decreasing the Difficulty:
The difficulty could be decreased by only testing the available images, without diving into the details of a given container framework and of how the images are generated. This would still provide a basic overview of container usage at an HPC site.

Resources:

  • Access to an HPC site and support with co-mentoring a student and ensuring expertise from an HPC site.
  • Laptop

*Online only in any case

Organisation:
CERN

Project reference: 2108

The solution of the independent-particle model (the DFT or Hartree-Fock method) is one of the key targets of many quantum chemistry codes. The Hartree-Fock method is an iterative procedure which consists of two time-consuming steps: first, the construction of the Fock matrix from the integrals and the density matrix; second, the diagonalization of the constructed Fock matrix.

Several years ago we effectively removed the diagonalization step, replacing it with matrix-matrix multiplications. Very recently we implemented the solution of the Hartree-Fock equations in localised orbitals, which opens the route toward a code applicable to very large molecules. To initialize the process, the integrals are transformed to the localized molecular orbital basis and are used throughout the entire run. The integral matrix should be sparse by construction. The density matrix is updated in each step of the iterative procedure and, for big systems, it is sparse as well. Currently, the Fock matrix construction is implemented only in sequential mode.

Students are expected to cooperate on the implementation and performance testing of the Fock matrix construction. Two different approaches will be implemented and tested.

First, a BLAS version combined with OpenMP or MPI. Second, we would like to implement the Fock matrix construction using sparse matrix multiplications. Both algorithms will be implemented in a local version of the quantum chemistry program DIRCCR12OS, which is mostly coded in Fortran. It contains a working MPI environment, so students will be able to use several routines already implemented in the code.
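
For orientation, the standard closed-shell Fock build contracts the density matrix with the two-electron integrals, F_{μν} = h_{μν} + Σ_{λσ} D_{λσ} [2(μν|λσ) − (μλ|νσ)]. The dense NumPy sketch below illustrates this contraction with random placeholder arrays; in DIRCCR12OS the integrals are in the localized-orbital basis, sparse and far larger, and the production code is Fortran.

  import numpy as np

  # Dense sketch of the closed-shell Fock construction with random placeholders:
  #   F[m,n] = h[m,n] + sum_{l,s} D[l,s] * (2*(mn|ls) - (ml|ns))
  n = 20                                           # toy basis-set size
  rng = np.random.default_rng(1)
  h = rng.random((n, n)); h = 0.5 * (h + h.T)      # one-electron part
  eri = rng.random((n, n, n, n))                   # two-electron integrals (mn|ls)
  D = rng.random((n, n)); D = 0.5 * (D + D.T)      # density matrix

  J = np.einsum("mnls,ls->mn", eri, D)             # Coulomb contribution
  K = np.einsum("mlns,ls->mn", eri, D)             # exchange contribution
  F = h + 2.0 * J - K
  print("Fock matrix shape:", F.shape)

When the integral and density matrices are sparse, the same contractions turn into sparse matrix multiplications, which is the second approach to be implemented.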

Traditional Hartree-Fock method scheme for large-scale molecular calculations (in this case with a phenylalanine peptide fragment as a test case).

Project Mentor: RNDr. Ján Simunek, PhD.

Project Co-mentor: Prof. Dr. Jozef Noga

Site Co-ordinator: Mgr. Lukáš Demovič, PhD.

Participants: Ioannis Savvidis, Eduárd Zsurka

Learning Outcomes:
Students will learn about Fortran and MPI environments. They will also become familiar with the efficient use of tensor contractions and parallel I/O in quantum chemistry algorithms.

Student Prerequisites (compulsory):
Background in quantum-chemistry or physics.

Student Prerequisites (desirable):
Advanced knowledge of Fortran, basic knowledge of MPI, BLAS libraries and other HPC tools.

Training Materials:
https://training.prace-ri.eu/index.php/prace-tutorials/

Workplan:

  • Week 1: training with existing code;
  • Weeks 2-3: introduction to Fortran, MPI and the theory of the implemented Hartree-Fock method,
  • Weeks 4-7: implementation, optimization and extensive testing/benchmarking of the code,
  • Week 8: report completion and presentation preparation

Final Product Description:
We believe that the resulting code will be capable of successfully completing local Hartree-Fock calculations using at least several compute nodes. The codes will be benchmarked and compared with the previously used versions of the HF calculations.

Adapting the Project: Increasing the Difficulty:
The goal is to push the efficiency of the MPI code(s) to maximum.

Adapting the Project: Decreasing the Difficulty:
Any successful implementation of local Hartree-Fock using MPI or with sparse matrix multiplications will be something new and acceptable from our side.

Resources:
Students will have access to the necessary learning material, as well as to our local IBM P775 supercomputer and x86 InfiniBand clusters.

Organisation:

CC SAS-Computing Centre, Centre of Operations of the Slovak Academy of Sciences

Project reference: 2107

Neural networks (NN) and deep learning are two success stories in modern artificial intelligence. They have led to major advances in image recognition, automatic text generation, and even in self-driving cars. NNs are designed to model the way in which the brain performs tasks or functions of interest. NNs can perform complex computations with ease.

“An essential paradigm of chemistry is that the molecular structure defines chemical properties. Inverse chemical design turns this paradigm on its head by enabling property-driven chemical structure exploration.” [1] Quantum chemistry provides powerful tools for the investigation of molecular properties and reactions. The rapid development of HPC has paved the way for chemists to utilize quantum chemistry in their everyday work, i.e. to understand, model, and predict molecular properties and reactions, properties of materials at the nanoscale, and reactions and processes taking place in biological systems.

The main goal of this project is to investigate the ability of NN frameworks to simulate molecular properties, where a NN can emulate the electronic wavefunction in a local atomic orbital representation as a function of molecular composition and atom positions, or of other molecular descriptors and representations. Our objective is to apply NN frameworks as predictors of molecular properties (energies, charges on atoms or evidence of hydrogen bonds) based on structural properties of these molecules (atomic positions). The NN frameworks will be implemented using the widely adopted TensorFlow library in Python. For the generation of molecular descriptors of chemical systems we apply the DScribe library [2], which can be incorporated as a module directly in Python code. Next to the aforementioned “application part” of the project, we also plan to (in)validate the widely accepted claim that GPGPUs are a superior execution platform for NNs compared to CPUs.
[1] Schütt, K.T., Gastegger, M., Tkatchenko, A. et al. Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions. Nat Commun 10, 5024 (2019)
[2] L. Himanen, M.O.J. Jäger, E.V. Morooka et al., DScribe: Library of descriptors for machine learning in materials science, Computer Physics Communications (2019)
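
As a minimal sketch of the intended pipeline (assuming ASE, DScribe and TensorFlow are installed; the molecules and target values are placeholders, and DScribe constructor argument names may differ between versions), descriptors are generated with DScribe and fed to a small Keras regressor:

  import numpy as np
  import tensorflow as tf
  from ase.build import molecule
  from dscribe.descriptors import CoulombMatrix

  # Tiny placeholder data set: a few molecules from ASE's built-in database and
  # made-up target values; real reference properties would come from quantum
  # chemistry calculations.
  mols = [molecule(name) for name in ("H2O", "NH3", "CH4", "C2H6", "CO2")]
  targets = np.array([-76.4, -56.5, -40.5, -79.8, -188.6])

  cm = CoulombMatrix(n_atoms_max=8)   # simple global descriptor per molecule
  X = np.array([cm.create(m) for m in mols]).reshape(len(mols), -1)

  model = tf.keras.Sequential([
      tf.keras.layers.Dense(64, activation="relu", input_shape=(X.shape[1],)),
      tf.keras.layers.Dense(64, activation="relu"),
      tf.keras.layers.Dense(1),
  ])
  model.compile(optimizer="adam", loss="mse")
  model.fit(X, targets, epochs=50, verbose=0)
  print(model.predict(X).ravel())

Whether training and inference of such models actually run faster on GPUs than on CPUs can then be checked simply by timing the same script on both platforms.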

Schematic diagram of descriptors (left) as inputs to the neural networks (right), along with hidden layers, and output.

Project Mentor: Ing. Marián Gall, PhD.

Project Co-mentor: Doc. Mgr. Michal Pitoňák, PhD.

Site Co-ordinator: Mgr. Lukáš Demovič, PhD.

Participants: Scott le Roux, Joseph Sleiman

Learning Outcomes:
Students will learn a lot about neural networks, molecular descriptors, TensorFlow, GPUs, MPI and HPC in general.

Student Prerequisites (compulsory):
Basic knowledge of Python and elementary chemistry/physics background.

Student Prerequisites (desirable):
Advanced knowledge of Python, C/C++, MPI, BLAS libraries and other HPC tools. Basic knowledge of neural networks and a quantum chemistry background.

Training Materials:
https://www.tensorflow.org/,
https://singroup.github.io/dscribe/latest/

Workplan:

  • Week 1: training;
  • Weeks 2-3: introduction to neural networks, TensorFlow, molecular descriptors, quantum chemistry and efficient implementation of algorithms;
  • Weeks 4-7: implementation, optimization and extensive testing/benchmarking of the codes;
  • Week 8: report completion and presentation preparation

Final Product Description:
The expected project result is a Python implementation of a (selected) neural network algorithm applied to a quantum chemistry problem. The parallel code will be benchmarked and compared to a GPU implementation.

Adapting the Project: Increasing the Difficulty:
Writing one’s own NN algorithm using MPI and/or CUDA, or building a custom molecular descriptor for molecular characterization.

Adapting the Project: Decreasing the Difficulty:
Applying an existing NN implementation to quantum chemistry problems.

Resources:
Students will have access to the necessary learning material, as well as to our local IBM P775 supercomputer and x86/GPU InfiniBand clusters. The software stack we plan to use is open source.

Organisation:

CC SAS-Computing Centre, Centre of Operations of the Slovak Academy of Sciences
