Performance visualization for bioinformatics pipelines

Project reference: 1715

Supercomputers can help speed up drug discovery that relies on machine learning. In this project, the student will work with our tool, which provides a new programming model and scheduler for running complex pipelines in a distributed environment on an HPC cluster. The goal is to give users more detailed information about pipeline performance, either through a custom visualization or by exporting the data to an existing performance analysis tool.

The attached picture shows a protein and the directed acyclic graph of a small cross-validation example.
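One kind of performance information that can be derived from a pipeline's directed acyclic graph is its critical path, the chain of dependent tasks that bounds the total run time. The following is a minimal sketch, not the project's tool: the task names, durations, and the `critical_path` helper are hypothetical and chosen only to resemble a small cross-validation run.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task records: name -> (duration in seconds, dependencies).
TASKS = {
    "load": (2.0, []),
    "split": (1.0, ["load"]),
    "train_fold_1": (8.0, ["split"]),
    "train_fold_2": (6.5, ["split"]),
    "evaluate": (1.5, ["train_fold_1", "train_fold_2"]),
}


def critical_path(tasks):
    """Return (total_time, path) for the longest duration-weighted path."""
    deps = {name: set(d) for name, (_, d) in tasks.items()}
    finish = {}  # earliest possible finish time of each task
    parent = {}  # predecessor of each task on its longest path
    # static_order() yields every task after all of its dependencies.
    for name in TopologicalSorter(deps).static_order():
        duration, d = tasks[name]
        # The task can start only when its slowest dependency has finished.
        best = max(d, key=lambda p: finish[p], default=None)
        start = finish[best] if best is not None else 0.0
        finish[name] = start + duration
        parent[name] = best
    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return total, path[::-1]


if __name__ == "__main__":
    total, path = critical_path(TASKS)
    print(f"critical path ({total} s): {' -> '.join(path)}")
```

Tasks on the critical path are natural candidates for highlighting in a visualization, since shortening any other task does not reduce the overall run time.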

Project Mentor: Jan Martinovič

Site Co-ordinator: Karina Pešatová

Learning Outcomes:

Basic knowledge of processing and visualizing the results of performance analysis of machine learning pipelines, such as those used in drug discovery

Student Prerequisites (compulsory): 

Basic programming skills in

  • C, C++
  • Python

Student Prerequisites (desirable): 

  • 2D data visualization
  • Parallel processing

Training Materials:

http://www.mcs.anl.gov/research/projects/perfvis/software/viewers/

C++: http://www.cplusplus.com/doc/tutorial/

Python: https://www.python.org/doc/

Workplan:

Week 1: Training

Week 2: Work plan setting

Week 3 – 6: Implementation of a software tool that extracts performance information from workflow pipelines and visualises it, either through a custom implementation or by exporting to an existing performance analysis tool

Week 7: Visualization of the results

Week 8: Final report completion and final presentation preparation

Final Product Description: 

A visualisation of performance information from workflow pipelines

Adapting the Project: Increasing the Difficulty:

In its original form, the project focuses on post-mortem analysis. The difficulty can be increased by providing the visualization in real time.
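As an illustration of the post-mortem route, recorded task timings can be converted after the run into a format that an existing trace viewer understands. The sketch below assumes the Trace Event JSON format, which viewers such as chrome://tracing and Perfetto can load. The project description does not name an export format, so this choice, the record layout, and the function names are all assumptions.

```python
import json

# Hypothetical post-mortem records: (task name, worker id, start s, end s).
RECORDS = [
    ("load", 0, 0.0, 2.0),
    ("train_fold_1", 0, 3.0, 11.0),
    ("train_fold_2", 1, 3.0, 9.5),
]


def to_trace_events(records):
    """Convert task records into a Trace Event format dictionary."""
    events = []
    for name, worker, start, end in records:
        events.append({
            "name": name,
            "ph": "X",                          # "complete" event: start plus duration
            "ts": round(start * 1e6),           # timestamps are in microseconds
            "dur": round((end - start) * 1e6),
            "pid": 0,                           # one pipeline run shown as one process
            "tid": worker,                      # one timeline row per worker
        })
    return {"traceEvents": events}


def write_trace(records, path):
    """Write the converted records to a JSON file a trace viewer can open."""
    with open(path, "w") as f:
        json.dump(to_trace_events(records), f, indent=2)


if __name__ == "__main__":
    write_trace(RECORDS, "pipeline_trace.json")
```

A real-time variant would need to stream events while the pipeline is still running instead of writing a single file at the end.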

Resources:

 

Software

  • Python
  • C, C++ programming environment

Hardware

  • Salomon cluster

Access to the appropriate software and hardware will be provided by the IT4Innovations National Supercomputing Center.

Organisation:
IT4Innovations National Supercomputing Center