Visualization data pipeline in PyCOMPSs/COMPSs

Project reference: 1601 COMPSs is a programming model and runtime that aims at parallelizing sequential applications written in sequential programming languages (Java, Python, C/C++). The paradigm used to parallelize is the data-dependence tasks: the application developer selects the parts of


Smartphone Task Farm

Project reference: 1611 EPCC has developed a prototype for a smartphone app that demonstrates parallelism and task farms in a distributed computing environment. It is targeted at outreach events where visitors can join a compute “cluster” with their own smartphone by

The CFD devil is in CAD details

CAD models are programmed in OPEN CASCADE with different levels of detail, which are then, using high performance computing, meshed and imported in OpenFOAM to calculate fluid flow around the given object.

The usual procedure for simulating the fluid flow around a body is to take the CAD model, mesh it and run the CFD simulation. After analysing the results of the simulation, new design solutions are proposed. The model is changed according to the new solutions and meshed again, and the CFD simulation is rerun. After several iterations of CAD modelling, meshing and CFD simulation, a final design is achieved.

Phine quarks and cude gluons

Simulations of Lattice Quantum Chromodynamics (the theory of quarks and gluons) are used to study properties of strongly interacting matter and can, e.g., be used to calculate properties of the quark-gluon plasma, a phase of matter that existed a few milliseconds after the Big Bang (at temperatures higher than a trillion degrees Celsius). Such simulations take up a large fraction of the available supercomputing resources worldwide.

Shape up or ship out – You decide!

Modern multi- and many-core hardware architectures provide a huge amount of floating point operations (FLOPs). However, CPU and GPU FLOPs cannot be harvested in the same manner. While the CPU is designed to minimize the latency of a stream of individual operations, the GPU tries to maximize the throughput. At this stage, the code developer is faced with a decision: expose more parallelism in the algorithm to support the GPU execution path, or optimize the code further for the CPU.

Journey to the centre of the human body

The objective of this project is to create an interactive game inspired by the movie “Innerspace”. In the first stage, a 3D virtual model of the human body will be created from Computed Tomography (CT) or Magnetic Resonance (MR) images. This model will then be imported into the game engine (Unity 3D). The engine will be customized, and a control system allowing the player to move through the 3D human body will be implemented. The outcome of this project could be used not only to promote HPC among the general public; it could also be developed into a more sophisticated system for the education of medical students.

Link prediction in large-scale networks with Hadoop framework

In this project, we will investigate how link prediction algorithms benefit from parallel computing in the Apache Hadoop ecosystem. We will implement a parallelized methodology for link prediction algorithms using the MapReduce model. We will examine two types of topological features, namely neighbourhood based (e.g., common neighbors, Jaccard coefficient, Adamic/Adar coefficient) and random walk based (e.g., Katz, rooted PageRank, SimRank). We will assess the performance and validity of the MapReduce implementation on several large-scale co-occurrence based biomedical networks. In addition, we will explore the influence of network properties on the performance of link prediction by varying network parameters (size, density, degree distribution, clustering coefficient, etc.). Finally, we will provide a prototype application for visualization of predicted links in the MEDLINE co-occurrence network.
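
To make the neighbourhood-based features concrete, here is a minimal single-machine sketch of the three scores named above. It is only an illustration: it does not use Hadoop or MapReduce, and the helper names (`build_adjacency`, `common_neighbors`, `jaccard`, `adamic_adar`) and the toy edge list are assumptions made for this example, not part of the project.

```python
import math

def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbors(adj, u, v):
    """Number of nodes adjacent to both u and v."""
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    """Shared neighbours divided by the size of the combined neighbourhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    """Sum of 1/log(degree) over shared neighbours; rare neighbours count more."""
    return sum(
        1.0 / math.log(len(adj[w]))
        for w in adj[u] & adj[v]
        if len(adj[w]) > 1  # guard against log(1) = 0
    )

# Hypothetical toy network, used only to exercise the functions.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)

cn_ad = common_neighbors(adj, "a", "d")
jc_ad = jaccard(adj, "a", "d")
aa_ad = adamic_adar(adj, "a", "d")
print(cn_ad, jc_ad, aa_ad)
```

In a MapReduce setting, the same scores would typically be computed by grouping candidate pairs under their shared neighbours rather than by looking up whole adjacency sets in memory, but that design is left open here.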

Visualization of real motion of human body based on motion capture technology

The main objective of this project is to create a platform for visualizing the real motion of the human body based on motion capture technology. First, a 3D virtual model of the human skeleton will be created. This model will move according to real human motion obtained through motion capture technology. The first model will be created manually using polygons. A second, more sophisticated model will be generated from Computed Tomography (CT) images. The outcome of this project could be used not only to promote HPC among the general public; it could also be developed into a more sophisticated system that physicians could later use for home treatment of patients with movement problems.

Development of a Performance Analytics Dashboard

An important part of HPC is understanding how optimisation carried out on application codes influences performance. Recently, focus has shifted towards energy efficiency with the onset of “green” HPC clusters. At ICHEC we are continuously pushing the boundaries towards this goal. Our recent endeavour involves developing a special prototype system that can measure the power consumption of different many-core technologies (e.g. GPUs, Intel Xeon Phi co-processors, FPGAs) used by many HPC applications. To make this prototype accessible to a wider audience, this project will aim to build a performance analytics dashboard that potential developers can use to submit and/or analyse their applications using a variety of metrics such as power consumption and performance. The dashboard could also be extended to suit a larger compute cluster and to monitor further information about application behaviours.

Visualisation of fluids and waves

There are many simulation codes that model the flow of fluids in different environments at a variety of scales. A relatively simple model consists of the so-called “shallow water equations”, which apply where the horizontal length scale is much greater than the vertical length scale. While simple, these equations have been used to simulate both small-scale and large-scale fluid flow, from water in a bathtub to waves in the ocean.

Molecular Dynamics simulation of the E545K PI3Ka mutant

The kinase PI3Ka is involved in fundamental cellular processes such as cell proliferation and differentiation. PI3Ka is frequently mutated in human malignancies. One of the most common mutations is located in exon 9 (E545K), where a glutamic acid is replaced by lysine. The E545K mutation results in an amino acid of opposite charge, where the glutamic acid (negative charge) is replaced by lysine (positive charge). It has recently been proposed that in this oncogenic charge-reversal mutation, the interactions of the protein catalytic subunit with the protein regulatory subunit are abrogated, resulting in loss of regulation and constitutive PI3Ka activation, which can lead to oncogenesis. To test this mechanism of protein overactivation, MD simulations will be used to examine the conformational changes that differ between the wild type (WT) and the mutant as they occur in microsecond simulations.

Re-ranking Virtual Screening results in computer-aided drug design

Virtual screening is a computational technique used in drug discovery to search libraries of small molecules in order to identify those compounds that are most likely to bind to a drug target, typically a protein. A protein family is a group of evolutionarily related proteins. When screening for drug candidates to identify inhibitors for a specific protein, the inhibitor should be very specific for the protein of interest and should not inhibit other proteins within the protein family. For example, the Activin-Like Kinases (ALK1-7) or the PI3K isoforms (PI3K alpha, beta, gamma, delta) have a high homology but perform different functions. Therefore, we would like our candidate drug to be selective for the protein of interest and not for other homologs or isoforms of the protein family, because inhibiting other members of the protein family may lead to undesirable side effects.

Weather forecasting for outreach on Wee Archie supercomputer

We have just finished building Wee Archie, a mini supercomputer that represents ARCHER, the UK national supercomputer hosted by us in Edinburgh. Wee Archie comprises 18 Raspberry Pis (8 control cores and 64 computation cores) housed in a custom-built casing with all the wires and networking hardware required for it to work with a minimum of set-up time. The idea is that it can easily be taken to outreach events, where the public can gain an understanding of what a supercomputer is made of, see parallelism illustrated and, through examples of real-world problems run on these machines, learn how HPC is central to their everyday lives without them knowing it. The hardware itself is complete, but more applications and demonstrations are needed for the machine to have the impact that we want.

In Situ or BAtch VIsualization of biogeochemical state of the Mediterranean Sea

Reanalyses (i.e. multi-decadal simulations with data assimilation) of the biogeochemical state of the Mediterranean Sea are an important tool for studying the nutrient and carbon cycles that characterize the Mediterranean, which is considered a “hot spot” for climate change (Lazzari et al., JMS 135, 2014; Giorgi, GRL 33, 2006).
Due to the lack of sufficient in-situ observational data and the length of the simulations carried out with the OGSTM-BFM model, which is necessary to resolve the proper time scales, researchers have to tune the model setup and particularly the boundary conditions (BCs), to which the model is very sensitive.

In Situ VIsualization of NAvier-Stokes Tornado Effect

The starting point of this work is the numerical study of a particular class of solutions of the 3D incompressible Navier-Stokes equations suggested by the theoretical work of Li and Sinai, who proved the existence of a blow-up for complex-valued solutions with suitable initial data.

Parallelising Scientific Python applications

The main aim of this project is to extend the course by parallelising the example applications within it. This would primarily be done using MPI and the mpi4py Python package, although other ways of introducing parallelisation will also be considered. By ensuring the application codes can run on ARCHER, the UK’s National Supercomputing service, the student will gain experience using a large HPC facility such as ARCHER.

Apache Spark: Bridge between HPC and Big Data?

The student is expected to cooperate on the implementation and performance testing of simple quantum chemistry method(s), such as Hartree-Fock/DFT or second-order Møller-Plesset perturbation theory, using Apache Spark, a general-purpose engine for large-scale data processing (an alternative to Hadoop/MapReduce). Although Spark runs on top of the JVM (Java Virtual Machine) and thus can hardly match the floating-point performance of Fortran/C(++) programs compiled to native machine code, it has many desirable features for (distributed) parallel applications: fault tolerance, node-aware distributed storage, caching and automated memory management. We are curious how far we can push the performance of a Spark application by, e.g., substituting critical parts with compiled native code or by using efficient BLAS-like libraries.

Calculation of nanotubes by utilizing the helical symmetry properties

Calculations of nanotubes are dominated by methods based on one-dimensional translational symmetry using a huge unit cell. A pseudo two-dimensional approach, in which the inherent helical symmetry of general-chirality nanotubes is exploited, has been limited to simple approximate model Hamiltonians. Currently, we are developing a new, unique code for fully ab initio calculations of nanotubes that explicitly uses the helical symmetry properties. The implementation is based on a formulation in two-dimensional reciprocal space where one dimension is continuous and the other is discrete. Independent-particle quantum chemistry methods, such as Hartree-Fock and/or DFT, or simple post-Hartree-Fock MP2, are used to calculate the band structures.

Mixed-precision linear solvers for lattice QCD

This project will involve optimising lattice Quantum Chromodynamics codes, which currently run on PRACE Tier-0 and other European Peta-scale supercomputers. The specific optimisation targeted involves using mixed precision, i.e. combined half, single and double precision arithmetic in the iterative solvers currently employed. The student will implement the set of extra functions required in an existing code, and run the code, check for correctness, tune the parameters of the solver and analyse the improvement in the performance.
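
The idea behind mixed-precision solvers can be sketched in a few lines: do most of the work in low precision, and correct the result with residuals computed in high precision. The toy example below is an assumption-laden illustration, not the project's code. It emulates single precision in pure Python by rounding through `struct`, uses a small diagonally dominant matrix instead of a lattice Dirac operator, uses plain Jacobi sweeps as the inner solver, and leaves out half precision entirely. All function names are hypothetical.

```python
import struct

def f32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def matvec(A, x):
    """Dense matrix-vector product in double precision."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def jacobi_low_precision(A, r, sweeps=20):
    """Approximately solve A e = r, rounding stored values to single precision."""
    n = len(r)
    A32 = [[f32(a) for a in row] for row in A]
    r32 = [f32(v) for v in r]
    e = [0.0] * n
    for _ in range(sweeps):
        e = [
            f32((r32[i] - sum(A32[i][j] * e[j] for j in range(n) if j != i)) / A32[i][i])
            for i in range(n)
        ]
    return e

def mixed_precision_solve(A, b, tol=1e-12, max_outer=50):
    """Outer loop in double precision, inner correction solve in emulated single precision."""
    x = [0.0] * len(b)
    for outer in range(max_outer):
        # The residual is computed in full double precision.
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        if max(abs(v) for v in r) < tol:
            return x, outer
        # The correction comes from the cheap low-precision solver.
        e = jacobi_low_precision(A, r)
        x = [xi + ei for xi, ei in zip(x, e)]
    return x, max_outer

# Small diagonally dominant test system (hypothetical data).
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
b = [1.0, 2.0, 3.0]
x, outer_iterations = mixed_precision_solve(A, b)
print(x, outer_iterations)
```

Each outer step can only gain roughly single-precision accuracy, so a handful of corrections is needed to reach a double-precision residual. The appeal in practice is that the low-precision inner solve moves less data and can run faster on suitable hardware.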

Topological susceptibility by direct calculation of the eigenmodes

The topological susceptibility probes properties of the rich QCD vacuum. It is a crucial quantity for measuring the topological fluctuations of the QCD vacuum, which play an important role in breaking the U_A(1) symmetry; it is therefore connected to the mass of the η’.

Development of sample application in PyCOMPSs/COMPSs

COMPSs is a programming model and runtime that aims at parallelizing sequential applications written in sequential programming languages (Java, Python, C/C++). The paradigm used to parallelize is data-dependence tasks: the application developer selects the parts of the code (methods, functions) that will become tasks and indicates the directionality of the arguments of the tasks. With this information, the COMPSs runtime builds a data-dependence task graph, in which the concurrency of the application at task level is inherent. The tasks are then executed on distributed resources (nodes in a cluster or a cloud). The runtime implements other features such as data transfer between nodes and elasticity in the cloud.
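
The notion of a data-dependence task graph can be illustrated with a toy scheduler built only from the Python standard library. This is a sketch and not the COMPSs or PyCOMPSs API. The dependencies are written out by hand here, whereas COMPSs derives them from the declared directionality of task arguments, and everything runs in local threads rather than on distributed nodes. The function `run_task_graph` and the task names are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

def run_task_graph(tasks, workers=4):
    """Run tasks given as name -> (function, [names of tasks whose results it consumes])."""
    sorter = TopologicalSorter({name: deps for name, (_, deps) in tasks.items()})
    sorter.prepare()
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Every task returned here has all of its inputs available,
            # so the whole batch can run concurrently.
            ready = sorter.get_ready()
            futures = {
                name: pool.submit(tasks[name][0], *(results[d] for d in tasks[name][1]))
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results

# Hypothetical application: two independent branches joined by a final task.
tasks = {
    "load_a": (lambda: [1, 2, 3], []),
    "load_b": (lambda: [4, 5, 6], []),
    "sum_a": (sum, ["load_a"]),
    "sum_b": (sum, ["load_b"]),
    "total": (lambda x, y: x + y, ["sum_a", "sum_b"]),
}
results = run_task_graph(tasks)
print(results["total"])
```

The two `load` tasks and the two `sum` tasks have no dependencies on each other, so they can run in parallel, while `total` has to wait for both sums. That is the sense in which the concurrency is inherent in the graph.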
