Encrypted volumes for PCOCC private clusters
Project reference: 1920
There is an increasing demand for processing confidential data, such as the genetic makeup of humans, personal data, and data that, if leaked or modified, could have serious legal consequences. Normally, supercomputers and clusters are shared by many users, which makes it difficult to meet strict security requirements. Furthermore, network restrictions hinder the flexibility and the open character that make clusters popular.
PCOCC (Private Cloud On a Compute Cluster) is a promising technology that enables creating private HPC clusters on existing clusters by automating the setup of KVM virtual machines connected by an overlay network. These private guest clusters can be tailored to the security requirements of each project while maintaining the flexibility and openness of the host cluster.
A security weakness in the current PCOCC software is that the disk images used for storing persistent cluster data are not encrypted. Because the disk images reside on the host cluster file systems, confidential data could be read by users with sufficient privileges on the host system. Encrypted volumes would greatly increase the usability of PCOCC for applications that require a high level of security.
The underlying virtualisation technology, KVM, already supports encrypted volumes, so making encrypted volumes available to users requires a modification to the PCOCC software itself. Fortunately, PCOCC is open source and written in Python, so it can be modified.
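To give an impression of what the virtualisation layer already offers, the sketch below builds a `qemu-img` command that creates a LUKS-encrypted qcow2 image. It is only an illustration and not part of PCOCC: the function names, file names and image size are hypothetical, and how PCOCC would pass the secret to its virtual machines is not covered here. It assumes a QEMU version with LUKS support for qcow2 images.

```python
import subprocess


def build_luks_create_command(image_path, size, secret_file, secret_id="sec0"):
    """Return the qemu-img argument list for a LUKS-encrypted qcow2 image.

    All parameter names are hypothetical; nothing here is PCOCC API.
    """
    return [
        "qemu-img", "create",
        # The passphrase is read from a file, so it never appears in the
        # process list of the shared host.
        "--object", f"secret,id={secret_id},file={secret_file}",
        "-f", "qcow2",
        "-o", f"encrypt.format=luks,encrypt.key-secret={secret_id}",
        str(image_path),
        size,
    ]


def create_encrypted_image(image_path, size, secret_file):
    """Run qemu-img; this requires QEMU to be installed on the host."""
    command = build_luks_create_command(image_path, size, secret_file)
    subprocess.run(command, check=True)
```

Calling `create_encrypted_image("data.qcow2", "10G", "passphrase.txt")` would create the image, provided the passphrase file exists and is readable only by its owner.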
During SoHPC, you will take steps towards adding encrypted volumes to the PCOCC software. If successful, you will test the usability and performance of PCOCC clusters with encrypted disk images. To complete this task, you will need to be familiar with Python programming and have a keen interest in cluster computing and virtualization technologies.
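As a rough idea of what a performance test could look like, the sketch below measures sequential write throughput in a given directory. Running it inside a guest once on an unencrypted volume and once on an encrypted one would give a first comparison. The function name, block size and data volume are assumptions made for illustration; a real evaluation would more likely rely on an established benchmarking tool and repeated runs.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_bytes=64 * 1024 * 1024,
                             block_size=1024 * 1024):
    """Write about total_bytes sequentially into directory; return MiB/s."""
    # Random data avoids optimisations for sparse or compressible blocks.
    block = os.urandom(block_size)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            written = 0
            while written < total_bytes:
                handle.write(block)
                written += block_size
            # Force the data to the device so the page cache is not measured alone.
            handle.flush()
            os.fsync(handle.fileno())
        elapsed = time.perf_counter() - start
    finally:
        # Remove the temporary file so repeated runs do not fill the volume.
        os.remove(path)
    return (written / (1024 * 1024)) / elapsed
```

For example, `measure_write_throughput("/mnt/encrypted")` and `measure_write_throughput("/mnt/plain")` could be compared, where both mount points are hypothetical.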
Project Mentor: Lykle Voort
Project Co-mentor: /
Site Co-ordinator: Lykle Voort
Participant: Kara Moraw
Learning Outcomes:
In this internship, you will learn how to implement a new feature in an existing, real-world application in a structured way. Furthermore, you will gain broad experience of how compute clusters work, and you will become familiar with various aspects of virtualization.
Student Prerequisites (compulsory):
- Linux on a user level
- Python programming
- Virtualisation techniques
- Some system maintenance
- Unix/Linux networking
Student Prerequisites (desirable):
- Linux cluster usage and maintenance
Training Materials:
Workplan:
- Week 1: setting up, getting familiar with the new location
- Week 2-3: Literature study; workplan (deliverable)
- Week 4-6: implementation; documentation
- Week 7: benchmarking; evaluation
- Week 8: final project report
Final Product Description:
The final result of this project, if successful, will be the ability to use encrypted volumes in PCOCC virtual clusters.
Adapting the Project: Increasing the Difficulty:
The difficulty can be increased by further testing.
Adapting the Project: Decreasing the Difficulty:
If the goal of the project proves too difficult, an alternative is to further automate the deployment of private clusters, with various options for security and/or features.
Resources:
The student will need access to a cluster with Intel Skylake and Intel Knights Landing systems (provided by us), standard computing resources (laptop) as well as an account on the Cartesius supercomputer (provided by us).
Organisation:
SURFsara