Development of a sample application in PyCOMPSs/COMPSs
Project reference: 1602
COMPSs is a programming model and runtime that aims to parallelize applications written in sequential programming languages (Java, Python, C/C++). Parallelization is based on data-dependent tasks: the application developer selects the parts of the code (methods, functions) that will become tasks and indicates the directionality of each task argument (input, output, or both). With this information, the COMPSs runtime builds a data-dependence task graph, which exposes the task-level concurrency of the application. The tasks are then executed on distributed resources (nodes in a cluster or a cloud). The runtime also implements other features, such as data transfer between nodes and elasticity in the cloud.
COMPSs has recently been extended to support Python codes through its Python binding (PyCOMPSs), and it is being integrated with new strategies to store data (Hecuba with Cassandra DB, and dataClay, a persistent objects library).
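The task-selection idea described above can be sketched in PyCOMPSs as follows. This is a minimal illustration, not code from the project: the functions partial_sum, add and sum_of_squares are hypothetical names chosen for the example. The sketch also assumes that, when PyCOMPSs is not installed, it is acceptable to fall back to no-op stand-ins so the same code runs sequentially.

```python
# Minimal sketch of a task-based decomposition in PyCOMPSs.
# Assumption: if PyCOMPSs is not installed, fall back to no-op stand-ins
# so that the example still runs as plain sequential Python.
try:
    from pycompss.api.task import task
    from pycompss.api.api import compss_wait_on
except ImportError:
    def task(*args, **kwargs):
        # Stand-in decorator: returns the function unchanged.
        def decorator(func):
            return func
        return decorator

    def compss_wait_on(obj):
        # Stand-in synchronization: the value is already available.
        return obj


@task(returns=1)
def partial_sum(chunk):
    # Hypothetical task: arguments are inputs by default,
    # and the result is declared with `returns`.
    return sum(x * x for x in chunk)


@task(returns=1)
def add(a, b):
    # Hypothetical task that combines two partial results.
    return a + b


def sum_of_squares(values, chunk_size):
    # Each call below is a task invocation; the partial sums do not
    # depend on each other, so they can be scheduled concurrently.
    partials = [
        partial_sum(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    # Each `add` depends on the previous total and on one partial sum,
    # which creates a chain of data dependencies.
    total = 0
    for part in partials:
        total = add(total, part)
    # Synchronize: retrieve the actual value of the final result.
    return compss_wait_on(total)


if __name__ == "__main__":
    print(sum_of_squares(list(range(10)), 3))
```

The main program stays sequential in appearance; only the decorated functions and the final synchronization point indicate where parallelism and data dependencies arise.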
The objective of this SoHPC proposal is to port an application to PyCOMPSs/COMPSs, preferably one that can generate some graphical result. We would prefer that the intern bring a sequential application to be parallelized (ideally written in Python, although Java would be a good option as well).
Alternatively, we can design an application to be ported by the intern.
The intern will have to design the COMPSs solution, selecting the parts of the code that will become tasks. This can be done iteratively, starting from a naïve solution that is refined step by step. In each step, besides performance numbers, information about the quality of the solution can be observed with the COMPSs monitor and with Paraver performance traces. Using these tools, potential bottlenecks and issues that prevent the application from scaling can be detected.
Executions could be run on the MareNostrum supercomputer and on the private cloud of the Workflows and Distributed Computing group.
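As an illustration of the performance numbers that could be tracked at each refinement step, the following sketch computes speedup and parallel efficiency from wall-clock times. The function names and the timings are hypothetical placeholders, not measurements from any real run.

```python
def speedup(t_sequential, t_parallel):
    # Speedup: how many times faster the parallel run is.
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, num_workers):
    # Parallel efficiency: speedup divided by the number of workers.
    return speedup(t_sequential, t_parallel) / num_workers


def scaling_report(timings):
    # `timings` maps a worker count to a wall-clock time in seconds.
    # The entry for 1 worker is taken as the sequential baseline.
    baseline = timings[1]
    return [
        (workers, speedup(baseline, t), efficiency(baseline, t, workers))
        for workers, t in sorted(timings.items())
    ]


if __name__ == "__main__":
    # Hypothetical placeholder timings (seconds), for illustration only.
    example_timings = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}
    for workers, s, e in scaling_report(example_timings):
        print(f"{workers} workers: speedup {s:.2f}, efficiency {e:.2f}")
```

Efficiency that drops as workers are added is a hint to inspect the traces for the cause, such as tasks that are too fine-grained or a long dependency chain.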
Project Mentor: Rosa Badia
Site Co-ordinator: Maria Ribera Sancho
Student: Marco Forte
The student will learn to program applications following the PyCOMPSs/COMPSs programming model and to execute applications that run in parallel on clusters and clouds.
Student Prerequisites (compulsory):
Advanced programming in Python or Java
Expertise in the application area used in the project (if the application is provided by the intern)
Student Prerequisites (desirable):
If the intern has a sequential application, it can be parallelized during the project.
Tutorial material is available on the COMPSs website:
Week 1: Training week
Week 2: Literature Review Preliminary Report (Plan writing)
Week 3 – 7: Project Development
Week 8: Final Report write-up
Final Product Description:
The final product will be a parallel application that can be executed on supercomputers. The visual part can be videos of the execution of the application, taken from our monitor, or, if the application has visual results, videos or images from the application itself.
Adapting the Project: Increasing the Difficulty:
The project is at the appropriate cognitive level, taking into account the timeframe and the need to submit a final working product and two reports.
The student will need access to standard computing resources (laptop, internet connection) as well as an account on MareNostrum.
Barcelona Supercomputing Centre