Distributed Memory Radix Sort
Project reference: 1915
A Radix Sort is an array sorting algorithm that can run in O(N) time, compared with typical sorting algorithms like QuickSort which run in O(Nlog(N)) time. Typical sorting algorithms sort elements using pairwise comparisons to determine ordering, meaning they can be easily adapted to any weakly ordered data types. The Radix Sort, however, uses the bitwise representation of the data and some bookkeeping to determine ordering, limiting its scope. In the case of numerical simulation, it is typically a combination of integer and floating point data being sorted, meaning a bitwise representation is accessible. Another potential issue is the stack size used. However, with a sufficiently clever implementation, it may be possible to avoid this issue.
This project aims to implement a Radix Sort in MPI for an array distributed across a number of processes. Initial implementations will look at sorting fixed-width integers, and then moving on to arbitrary bit-wise representable data. Following on, supporting various array distribution and chunking approaches can be investigated. Upon achieving sufficient usefulness, functionality, and performance, the project can be wrapped up into a library (finding immediate use in the ExSeisDat project).
Project Mentor: Paddy Ó Conbhuí
Project Co-mentor: /
Site Co-ordinator: Simon Wong
Participant: Jordy Innocentius Ajanohoun
Solving problems using parallel and recursive decomposition in MPI. Writing clean and reusable C for use in other HPC projects.
Student Prerequisites (compulsory):
Intermediate C, Basic familiarity with MPI.
Student Prerequisites (desirable):
- Familiarity with recursive algorithms.
- Familiarity with sorting algorithms.
- Familiarity with Scan, Reduction, and AllToAll algorithms in MPI.
- Familiarity with creating sub-communicators in MPI.
- Experience with Linux.
C++Now 2017: M. Skarupke “Sorting in less than O(n log n): Generalizing and optimizing radix sort” https://www.youtube.com/watch?v=zqs87a_7zxw
- Week 1: Training
- Week 2: Learn about Radix Sort, serial implementations, distributed memory approaches. Set up an initial project, attempt initial implementation.
- Week 3: Write Plan
- Week 4: Implement Radix Sort for int8, int16, int32, and int64
- Week 5: Implement Radix Sort for float, double, and user-defined functions returning a bitwise representation of their data.
- Week 6: Profile + Optimize implementations.
- Week 7: Write Report
- Week 8: Write Report
Final Product Description:
A reusable library with a clean interface that uses Radix Sort to sort a distributed array using MPI.
Adapting the Project: Increasing the Difficulty:
The student could attempt to incorporate it into the ExSeisDat project, or go quite deep into the optimization of the sorting and communications.
Adapting the Project: Decreasing the Difficulty:
The student could stop at implementing int8, or int16, or anywhere along that path, and move on to profiling.
- A desk. There’s one available next to Dr. Ó Conbhuí.
- A computer. The student should bring their own.
- Access to a distributed memory machine for profiling the performance of the sorting routine. (Can access to / time on Kay be arranged?)
Irish Centre for High-End Computing